Correlation, Prediction and Ranking of Evaluation Metrics in Information Retrieval Soumyajit Gupta, Mucahid Kutlu, Vivek Khetan, and Matthew Lease ECIR 2019, Cologne, Germany
So many metrics… 2 ▸ More than 100 evaluation metrics have been proposed ▸ Limited time and space to report them all
Which ones should we report?
Challenge in system comparisons 4 ▸ [Results tables taken from two different papers] ▸ If paper A reports metric X and paper B reports metric Y on the same collection, how can we tell which system is better?
Some ideas… 5 ▸ Run both systems again on the collection ▸ Do they share their code? ▸ Re-implement the methods ▸ Are they well explained in the paper? ▸ Find a common baseline both papers compare against and compare indirectly
Our Proposal 6 ▸ Wouldn’t it be nice to predict a system’s performance on metric X using its performance on other metrics as features? ▸ Here is the general idea ▸ Build a predictor using only metric scores as features ▸ Predict the unknown metric from the known ones ▸ Compare systems based on the predicted score, with some confidence value ▸ Going back to our example: ▸ Predict A’s P@20 score using its MAP, P@10, P@30 and NDCG scores ▸ Compare A’s predicted P@20 with B’s actual P@20
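A minimal sketch of this idea (illustrative only, not the authors' code), assuming per-system average scores from a shared collection are available as training data; all numbers and the feature choice below are made up:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-system scores on a common collection:
# feature columns = [MAP, P@10, P@30, NDCG], target = P@20
X_train = np.array([
    [0.21, 0.38, 0.30, 0.45],
    [0.25, 0.42, 0.33, 0.49],
    [0.18, 0.35, 0.27, 0.41],
    [0.30, 0.50, 0.40, 0.55],
])
y_train = np.array([0.34, 0.37, 0.31, 0.45])  # observed P@20 for those systems

model = LinearRegression().fit(X_train, y_train)

# System A reports MAP, P@10, P@30, and NDCG but not P@20:
system_a = np.array([[0.27, 0.44, 0.35, 0.51]])
predicted_p20 = model.predict(system_a)[0]
print(f"Predicted P@20 for system A: {predicted_p20:.3f}")
# Compare this predicted score (with some confidence value) against
# system B's actually reported P@20.
```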
Correlation between Metrics 7
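For illustration only (the slide reports the correlations measured over TREC runs), pairwise correlations between metrics can be computed from per-system scores along these lines; the scores below are invented:

```python
import pandas as pd

# Rows = systems (runs), columns = metric scores on the same collection.
scores = pd.DataFrame({
    "MAP":  [0.21, 0.25, 0.18, 0.30, 0.27],
    "P@10": [0.38, 0.42, 0.35, 0.50, 0.44],
    "NDCG": [0.45, 0.49, 0.41, 0.55, 0.51],
    "RBP":  [0.40, 0.43, 0.36, 0.52, 0.47],
})

print(scores.corr(method="pearson"))   # linear correlation between metrics
print(scores.corr(method="spearman"))  # rank agreement between metrics
```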
Prediction 8 ▸ Goal: investigate which set of K evaluation metrics best predicts a particular metric ▸ Training data: system average scores over topics in the WT2000-01, RT2004, and WT2010-11 collections ▸ Test data: WT2012, WT2013, and WT2014 ▸ Learning algorithms: Linear Regression and SVM ▸ Approach: ▸ For a particular target metric, try all combinations of size K of the other evaluation metrics on WT2012 ▸ Pick the best-performing combination and apply it to WT2013 and WT2014
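A hedged sketch of the combination search described above, assuming per-system metric scores for the training and development collections are held in pandas DataFrames; the function name and the R² selection criterion are my assumptions, and the SVM variant would be analogous:

```python
from itertools import combinations
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def best_k_predictors(train_df, dev_df, target, k):
    """Try every size-k subset of the non-target metrics, fit a linear model
    on the training collections, and keep the subset scoring best on the
    development collection (e.g. WT2012)."""
    candidates = [m for m in train_df.columns if m != target]
    best_subset, best_score = None, float("-inf")
    for subset in combinations(candidates, k):
        cols = list(subset)
        model = LinearRegression().fit(train_df[cols], train_df[target])
        score = r2_score(dev_df[target], model.predict(dev_df[cols]))
        if score > best_score:
            best_subset, best_score = subset, score
    # The winning subset is then applied unchanged to WT2013 and WT2014.
    return best_subset, best_score
```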
Prediction Results 9
Which metrics should I report?
Ranking Metrics 11 ▸ Metrics are correlated ▸ Why report redundant ones? ▸ Goal: report the most informative set of metrics ▸ Finding the optimal subset is NP-hard ▸ Iterative Backward Strategy: ▸ Start with the full set of metrics and their covariance matrix ▸ Iteratively prune the less informative ones ▸ At each step, remove the metric whose removal leaves the remaining set with maximum entropy ▸ Greedy Forward Strategy: ▸ Start with an empty set ▸ Greedily add the most informative metrics ▸ At each step, pick the metric that is most correlated with all the remaining (unselected) ones
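An illustrative sketch of both strategies as described on the slide, using the log-determinant of the covariance submatrix as a Gaussian entropy proxy; the function names and details are assumptions, not the authors' implementation:

```python
import numpy as np

def backward_ranking(cov, names):
    """Iteratively drop the metric whose removal leaves the remaining set
    with maximum entropy (log-det of the covariance submatrix); metrics
    dropped last are ranked as the most informative."""
    remaining = list(range(len(names)))
    dropped = []
    while len(remaining) > 1:
        def entropy_without(i):
            keep = [j for j in remaining if j != i]
            return np.linalg.slogdet(cov[np.ix_(keep, keep)])[1]
        least_informative = max(remaining, key=entropy_without)
        remaining.remove(least_informative)
        dropped.append(least_informative)
    dropped.append(remaining[0])
    return [names[i] for i in reversed(dropped)]  # most informative first

def forward_ranking(corr, names):
    """Greedily pick the metric most correlated (on average) with the
    metrics not yet selected."""
    remaining = list(range(len(names)))
    order = []
    while remaining:
        def avg_corr(i):
            others = [j for j in remaining if j != i]
            return np.mean(np.abs(corr[i, others])) if others else 0.0
        best = max(remaining, key=avg_corr)
        remaining.remove(best)
        order.append(best)
    return [names[i] for i in order]
```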
Metrics ranked by each algorithm 12
Conclusion 13 ▸ Quantified the correlation between 23 popular IR metrics on 8 TREC test collections ▸ Showed that accurate prediction of MAP, P@10, and RBP can be achieved using 2-3 other metrics ▸ Presented a covariance-based model for ranking evaluation metrics, enabling selection of the most informative and distinctive set of metrics
Thank you! 14 This work was funded by the Qatar National Research Fund, a member of Qatar Foundation.