Evaluation in Information Retrieval
Mandar Mitra
Indian Statistical Institute
Outline
1 Preliminaries
2 Metrics
3 Forums
4 Tasks
  Task 1: Morpheme extraction
  Task 2: RISOT
  Task 3: SMS-based FAQ retrieval
  Task 4: Microblog retrieval
Motivation
Which is better: Heap sort or Bubble sort?
Which is better? [the original slide compares two systems side by side using images]
Motivation
IR is an empirical discipline.
- Intuition can be wrong!
- “Sophisticated” techniques need not be the best, e.g. rule-based stemming vs. statistical stemming.
- Proposed techniques need to be validated and compared to existing techniques.
Cranfield method (Cleverdon et al., 60s)
Benchmark data
- Document collection (the “syllabus”)
- Query / topic collection (the “question paper”)
- Relevance judgments - information about which document is relevant to which query (the “correct answers”)
Assumptions
- Relevance of a document to a query is objectively discernible.
- All relevant documents contribute equally to the performance measures.
- Relevance of a document is independent of the relevance of other documents.
Evaluation metrics
Background
- User has an information need.
- Information need is converted into a query.
- Documents are relevant or non-relevant.
- Ideal system retrieves all and only the relevant documents.
[Diagram: information need, user, system, document collection]
Set-based metrics

Recall = #(relevant retrieved) / #(relevant)
       = #(true positives) / #(true positives + false negatives)

Precision = #(relevant retrieved) / #(retrieved)
          = #(true positives) / #(true positives + false positives)

F = 1 / (α/P + (1 − α)/R) = (β² + 1)·P·R / (β²·P + R)
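A minimal Python sketch of these set-based measures, assuming the retrieved and relevant results for a query are available as sets of document IDs (the IDs and function name below are illustrative, not from the slides):

```python
def precision_recall_f(retrieved, relevant, beta=1.0):
    """Set-based precision, recall and F for a single query."""
    tp = len(retrieved & relevant)                 # relevant retrieved = true positives
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    if precision == 0.0 and recall == 0.0:
        f = 0.0
    else:
        # F = (beta^2 + 1) P R / (beta^2 P + R); beta = 1 gives the harmonic mean of P and R
        f = (beta ** 2 + 1) * precision * recall / (beta ** 2 * precision + recall)
    return precision, recall, f

# Example: 3 of 4 retrieved documents are relevant; 5 documents are relevant in all.
print(precision_recall_f({"d1", "d2", "d3", "d7"}, {"d1", "d2", "d3", "d4", "d5"}))
# -> (0.75, 0.6, 0.666...)
```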
Metrics for ranked results
(Non-interpolated) average precision

Which is better?

Rank   Ranking 1       Ranking 2
1      Non-relevant    Relevant
2      Non-relevant    Relevant
3      Non-relevant    Non-relevant
4      Relevant        Non-relevant
5      Relevant        Non-relevant
Metrics for ranked results
(Non-interpolated) average precision

Rank   Type           Recall   Precision
1      Relevant       0.2      1.00
2      Non-relevant
3      Relevant       0.4      0.67
4      Non-relevant
5      Non-relevant
6      Relevant       0.6      0.50
∞      Relevant       0.8      0.00
∞      Relevant       1.0      0.00

(5 relevant docs. in all)

AvgP = (1/5) · (1 + 2/3 + 3/6)

AvgP = (1/N_Rel) · Σ_{d_i ∈ Rel} i / Rank(d_i)
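A sketch of non-interpolated average precision, assuming the ranked list is encoded as booleans (True = relevant at that rank) and the total number of relevant documents for the query is known; the names are illustrative:

```python
def average_precision(is_relevant, n_rel):
    """AvgP = (1/N_Rel) * sum over retrieved relevant docs d_i of i / Rank(d_i)."""
    ap, rel_seen = 0.0, 0
    for rank, rel in enumerate(is_relevant, start=1):
        if rel:
            rel_seen += 1
            ap += rel_seen / rank        # precision at the rank of the i-th relevant doc
    return ap / n_rel                    # unretrieved relevant docs contribute 0

# The ranking from the table above (R, N, R, N, N, R), with 5 relevant docs in all:
print(average_precision([True, False, True, False, False, True], 5))
# -> (1 + 2/3 + 3/6) / 5 ≈ 0.433
```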
Metrics for ranked results
Interpolated average precision at a given recall point
- Recall points correspond to multiples of 1/N_Rel.
- N_Rel is different for different queries.
- Interpolation is required to compute averages across queries.
[Plot: precision (P) vs. recall (R), from 0.0 to 1.0, for Q1 (3 rel. docs) and Q2 (4 rel. docs)]
Metrics for ranked results
Interpolated average precision

P_int(r) = max_{r′ ≥ r} P(r′)

11-pt interpolated average precision

Rank   Type           Recall   Precision
1      Relevant       0.2      1.00
2      Non-relevant
3      Relevant       0.4      0.67
4      Non-relevant
5      Non-relevant
6      Relevant       0.6      0.50
∞      Relevant       0.8      0.00
∞      Relevant       1.0      0.00

R      Interp. P
0.0    1.00
0.1    1.00
0.2    1.00
0.3    0.67
0.4    0.67
0.5    0.50
0.6    0.50
0.7    0.00
0.8    0.00
0.9    0.00
1.0    0.00
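A sketch of 11-point interpolated precision, assuming (recall, precision) pairs are known at the ranks where relevant documents appear, with precision 0 at the recall levels of unretrieved relevant documents, as in the table above:

```python
def eleven_point_interpolated(recall_precision_points):
    """P_int(r) = max over r' >= r of P(r'), evaluated at r = 0.0, 0.1, ..., 1.0."""
    table = []
    for k in range(11):
        r = k / 10
        candidates = [p for rec, p in recall_precision_points if rec >= r - 1e-9]
        table.append((r, max(candidates) if candidates else 0.0))
    return table

# Points taken from the example: precision at recall levels 0.2, 0.4, 0.6, 0.8, 1.0
points = [(0.2, 1.00), (0.4, 0.67), (0.6, 0.50), (0.8, 0.00), (1.0, 0.00)]
for r, p in eleven_point_interpolated(points):
    print(f"{r:.1f}  {p:.2f}")
```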
Metrics for ranked results
11-pt interpolated average precision
[Plot: interpolated precision at recall levels 0.0 to 1.0]
Metrics for sub-document retrieval
Let
- p_r: document part retrieved at rank r
- rsize(p_r): amount of relevant text contained by p_r
- size(p_r): total number of characters contained by p_r
- T_rel: total amount of relevant text for a given topic

P[r] = Σ_{i=1..r} rsize(p_i) / Σ_{i=1..r} size(p_i)
R[r] = (1/T_rel) · Σ_{i=1..r} rsize(p_i)
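A sketch of these sub-document measures, assuming each retrieved part is represented by a hypothetical (rsize, size) pair in characters and T_rel is known for the topic:

```python
def passage_precision_recall(parts, t_rel, r):
    """P[r] and R[r] over the top-r retrieved document parts."""
    top = parts[:r]
    rel_chars = sum(rsize for rsize, _ in top)      # relevant characters retrieved
    all_chars = sum(size for _, size in top)        # all characters retrieved
    precision = rel_chars / all_chars if all_chars else 0.0
    recall = rel_chars / t_rel if t_rel else 0.0
    return precision, recall

# Hypothetical example: two 100-character parts containing 60 and 10 relevant
# characters; 200 characters of relevant text exist for the topic.
print(passage_precision_recall([(60, 100), (10, 100)], t_rel=200, r=2))
# -> (0.35, 0.35)
```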
Metrics for ranked results
Precision at k (P@k) - precision after k documents have been retrieved
- easy to interpret
- not very stable / discriminatory
- does not average well
R-precision - precision after N_Rel documents have been retrieved
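A sketch of P@k and R-precision using the same boolean ranking representation as in the average-precision example; the cut-off values are illustrative:

```python
def precision_at_k(is_relevant, k):
    """Precision after k documents have been retrieved."""
    return sum(is_relevant[:k]) / k

def r_precision(is_relevant, n_rel):
    """Precision after N_Rel documents have been retrieved."""
    return precision_at_k(is_relevant, n_rel)

ranking = [True, False, True, False, False, True]   # the example ranking from earlier
print(precision_at_k(ranking, 5))                    # P@5 = 2/5
print(r_precision(ranking, 5))                       # R-precision with N_Rel = 5
```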
Cumulated Gain
Idea:
- Highly relevant documents are more valuable than marginally relevant documents.
- Documents ranked low are less valuable.

Gain ∈ {0, 1, 2, 3}
G = ⟨3, 2, 3, 0, 0, 1, 2, 2, 3, 0, ...⟩
CG[i] = Σ_{j=1..i} G[j]
(n)DCG

DCG[i] = CG[i]                        if i < b
         DCG[i−1] + G[i] / log_b i    if i ≥ b

Ideal G = ⟨3, 3, ..., 3, 2, ..., 2, 1, ..., 1, 0, ...⟩

nDCG[i] = DCG[i] / Ideal DCG[i]
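A sketch of DCG and nDCG with the discounting scheme above, assuming graded gains in {0, 1, 2, 3}. The ideal gain vector should in general be built from all relevant documents for the topic; sorting the observed gains below is a simplification:

```python
import math

def dcg(gains, b=2):
    """DCG[i] = CG[i] for i < b; DCG[i-1] + G[i]/log_b(i) for i >= b."""
    total, curve = 0.0, []
    for i, g in enumerate(gains, start=1):
        total += g if i < b else g / math.log(i, b)
        curve.append(total)
    return curve

def ndcg(gains, ideal_gains=None, b=2):
    """nDCG[i] = DCG[i] / Ideal DCG[i]."""
    # Simplification: default the ideal gains to the observed gains, sorted.
    ideal = sorted(ideal_gains if ideal_gains is not None else gains, reverse=True)
    return [d / ideal_d if ideal_d > 0 else 0.0
            for d, ideal_d in zip(dcg(gains, b), dcg(ideal, b))]

G = [3, 2, 3, 0, 0, 1, 2, 2, 3, 0]    # the gain vector from the example
print(dcg(G))
print(ndcg(G))
```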
Mean Reciprocal Rank
Useful for known-item searches with a single target.
Let r_i be the rank at which the “answer” for query i is retrieved.
Then reciprocal rank = 1/r_i.

Mean reciprocal rank (MRR) = (1/n) · Σ_{i=1..n} 1/r_i
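A sketch of MRR, assuming each query is summarised by the rank at which its single answer was found (None if the answer was not retrieved); the inputs are illustrative:

```python
def mean_reciprocal_rank(answer_ranks):
    """MRR = (1/n) * sum of 1/r_i, counting missed answers as 0."""
    reciprocal_ranks = [1.0 / r if r is not None else 0.0 for r in answer_ranks]
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

# Three queries: answers found at ranks 1 and 4, not found for the third query.
print(mean_reciprocal_rank([1, 4, None]))   # -> (1 + 0.25 + 0) / 3 ≈ 0.417
```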
Assumptions
- All relevant documents contribute equally to the performance measures.
- Relevance of a document to a query is objectively discernible.
- Relevance of a document is independent of the relevance of other documents.
- All relevant documents in the collection are known.