Overview of NTCIR-14
Makoto P. Kato (University of Tsukuba)
Yiqun Liu (Tsinghua University)
Introduction to NTCIR
• Evaluation Forum
  – An opportunity for researchers to get together to solve challenging research problems based on cooperation between task organizers and participants:
    • Task organizers design tasks, prepare test collections, and evaluate participants' systems
    • Participants develop systems to achieve better performance in the tasks
[Figure: task organizers provide tasks and test collections to participants; participants submit system output; task organizers return evaluation results]
Benefits of Evaluation Forums
• Task organizers
  – Can obtain many findings on a certain problem
  – Can share the workload of building a large-scale test collection
  – Can draw attention to a certain research direction
• Participants
  – Can focus on solving problems
  – Can tackle well-recognized problems with new resources, or novel problems at an early stage
  – Can demonstrate the performance of their systems in a fair comparison
• Both sides should benefit
Task Selection Procedure
• PC co-chairs asked PC members to review task proposals from task organizers
• PC co-chairs made decisions based on the reviews
[Figure: task organizers submit task proposals to the program committee; PC members write reviews; PC co-chairs make the decision]
Please consider submitting task proposals to NTCIR!
NTCIR-14 Program Committee (PC)
• Ben Carterette (University of Delaware, USA)
• Hsin-Hsi Chen (National Taiwan University, Taiwan)
• Tat-Seng Chua (National University of Singapore, Singapore)
• Nicola Ferro (University of Padova, Italy)
• Kalervo Järvelin (University of Tampere, Finland)
• Gareth J. F. Jones (Dublin City University, Ireland)
• Mandar Mitra (Indian Statistical Institute, India)
• Douglas W. Oard (University of Maryland, USA)
• Maarten de Rijke (University of Amsterdam, the Netherlands)
• Tetsuya Sakai (Waseda University, Japan)
• Mark Sanderson (RMIT University, Australia)
• Ian Soboroff (NIST, USA)
• Emine Yilmaz (University College London, United Kingdom)
Review Process
• 7+2 NTCIR-14 task proposals, each of which was reviewed by 4 or more PC members
• 6+1 tasks were accepted, of which 5 are core tasks and 1+1 are pilot tasks
NTCIR-14 General Schedule (note that each task could have its own schedule)
• Mar 20, 2018: NTCIR-14 Kickoff
• May 15, 2018: Task Registration Due
• Jun 2018: Dataset Release
• Jun–Jul 2018: Dry Run
• Aug–Oct 2018: Formal Run
• Feb 1, 2019: Evaluation Result Release
• Feb 1, 2019: Task Overview Paper Release (draft)
• Mar 15, 2019: Participant Paper Submission Due
• May 1, 2019: Camera-Ready Participant Paper Due
• Jun 2019: NTCIR-14 Conference & EVIA 2019 at NII, Tokyo
Focuses of NTCIR-14
1. Heterogeneous information access
2. Dialogue generation and analysis
3. Meta research on information access communities
[Figure: NTCIR-14 tasks grouped by theme]
• Search (heterogeneous data): web pages (WWW), questions (OpenLiveQ), lifelog data (Lifelog)
• Summarize (dialog data): QALab
• Generate (dialogues): STC
• Understand (numeric info. in dialog data): FinNum
• Reproduce (the best practices): CENTRE
NTCIR-14 Tasks
• Core Tasks
  – Lifelog-3 (Lifelog Search Task)
  – OpenLiveQ-2 (Open Live Test for Question Retrieval)
  – QALab-PoliInfo (Question Answering Lab for Political Information)
  – STC-3 (Short Text Conversation)
  – WWW-2 (We Want Web)
• Pilot Tasks
  – CENTRE (CLEF/NTCIR/TREC REproducibility)
  – FinNum (Fine-Grained Numeral Understanding in Financial Tweet)
Number of Active Participants
• QA Lab for Entrance Exam (QALab) (11, 12, 13) → QA Lab for Political Information (QALab-PoliInfo) (14): 13
• Personal Lifelog Organisation & Retrieval (Lifelog) (12, 13, 14): 6
• Short Text Conversation (STC) (12, 13, 14): 13
• Open Live Test for Question Retrieval (OpenLiveQ) (13, 14): 4
• We Want Web (WWW) (13, 14): 4
• Fine-Grained Numeral Understanding in Financial Tweet (FinNum) (14): 6
• CLEF/NTCIR/TREC REproducibility (CENTRE) (14): 1
• Total: 47
Active participants: research groups that submitted final results for evaluation
Jargon: Test Collection
• A general test collection consists of inputs and their expected outputs
• An IR test collection consists of:
  – Topics (the input)
  – A document collection (indexed by the search system)
  – Relevance judgements (the expected output, e.g., highly relevant, …, irrelevant)
• The search system takes the topics as input, and its output is evaluated against the relevance judgements
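For illustration, a minimal sketch of how a test collection is used in evaluation; the topics, judgements, and ranked run below are toy placeholders, not NTCIR data:

```python
# Minimal sketch of evaluating a search system against an IR test collection.
# All identifiers (topic "T1", documents d1-d4) are hypothetical toy data.

# Relevance judgements (qrels): topic -> {doc_id: graded relevance}
qrels = {
    "T1": {"d1": 2, "d2": 1, "d3": 0},  # 2 = highly relevant, 1 = relevant, 0 = irrelevant
}

# A ranked list returned by the search system for each topic
run = {"T1": ["d2", "d4", "d1"]}

def precision_at_k(topic, ranked_docs, k=3):
    """Fraction of the top-k retrieved documents judged relevant (grade > 0)."""
    top_k = ranked_docs[:k]
    relevant = sum(1 for d in top_k if qrels[topic].get(d, 0) > 0)
    return relevant / k

for topic, ranked_docs in run.items():
    print(topic, "P@3 =", precision_at_k(topic, ranked_docs))
```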
Jargon: Training, Development, and Test Sets
[Figure: the data is split into training, development, and test sets; the system is trained on the first two and evaluated on the test set]
• Training set: can be used to tune parameters in the system
• Dev. set: can be used to tune hyper-parameters in the system
• Test set: cannot be used to tune the system; it is used only for evaluating the output
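For illustration, a minimal sketch of this split with a hypothetical toy dataset and a placeholder model: the hyper-parameter is chosen on the dev. set, and the test set is touched only once for the final evaluation:

```python
# Minimal sketch of a train/dev/test split; the data and "model" are toy placeholders.
import random

random.seed(0)
data = [(x, x * 2 + random.random()) for x in range(100)]  # toy (input, label) pairs
random.shuffle(data)

train, dev, test = data[:70], data[70:85], data[85:]

def train_and_evaluate(train_set, eval_set, hyper_param):
    """Placeholder: fit a single slope on train_set, scaled by a hyper-parameter,
    and return the mean absolute error on eval_set."""
    slope = sum(y for _, y in train_set) / sum(x or 1 for x, _ in train_set) * hyper_param
    return sum(abs(y - slope * x) for x, y in eval_set) / len(eval_set)

# Hyper-parameters are tuned on the dev. set ...
best = min([0.5, 1.0, 1.5], key=lambda h: train_and_evaluate(train, dev, h))
# ... and the test set is used only once, for the final evaluation.
print("test error:", train_and_evaluate(train, test, best))
```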
Jargon: Run / Dry Run / Formal Run
• Run: a result of a single execution of a developed system (e.g., "This team submitted a run")
• Dry run: a preliminary trial for improving the task design and familiarizing participants with the task
• Formal run: an actual trial whose submissions and results are officially recorded
Jargon: Evaluation Metric
• A measure of system performance
• General evaluation metrics, where C is the set of correct items (judged by an assessor) and O is the system output:
  – Precision: P = |C ∩ O| / |O|
  – Recall: R = |C ∩ O| / |C|
  – F1-measure: F1 = 2PR / (P + R)
• IR evaluation metrics: MAP, nDCG, ERR, Q-measure
• Summarization evaluation metrics: ROUGE
Please Google or Bing for details; they will be used in the overview presentations.
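For illustration, a minimal sketch computing the general metrics above on toy sets (C and O are made up for this example):

```python
# Minimal sketch of set-based precision, recall, and F1; C and O are toy sets.
correct = {"a", "b", "c", "d"}        # C: items judged correct by the assessor
output = {"b", "c", "e"}              # O: items returned by the system

hits = len(correct & output)          # |C ∩ O|
precision = hits / len(output)        # P = |C ∩ O| / |O|
recall = hits / len(correct)          # R = |C ∩ O| / |C|
f1 = 2 * precision * recall / (precision + recall) if hits else 0.0

print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")  # P=0.67 R=0.50 F1=0.57
```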
ENJOY THE CONFERENCE!
• Keynote (TODAY)
• Task Overviews (TODAY)
• Invited Talks (TODAY)
• Task Sessions (DAY-3 and DAY-4)
• Poster Sessions (DAY-3 and DAY-4)
• Banquet (DAY-3)
• Panel (DAY-4)
• Break-out Sessions (DAY-4)