Overview of the 2019 Open-Source IR Replicability Challenge (OSIRRC 2019)
Ryan Clancy, Nicola Ferro, Claudia Hauff, Jimmy Lin, Tetsuya Sakai, Ze Zhong Wu
Vision
The ultimate candy store for information retrieval researchers!
See a result you like? Click a button to recreate those results!
Really, any result? (not quite… let’s start with batch ad hoc retrieval experiments on standard test collections)
What is this, really?
Sources: saveur.com, Wikipedia (Candy)
Repeatability: you can recreate your own results again (we get this “for free”)
Replicability: others can recreate your results, with your code (our focus)
Reproducibility: others can recreate your results, with code they rewrite (stepping stone…)
ACM Artifact Review and Badging Guidelines
Why is this important?
Good science
Sustained cumulative progress
Armstrong et al. (CIKM 2009): little empirical progress made from 1998 to 2009. Why? Researchers compare against weak baselines.
Yang et al. (SIGIR 2019): researchers still compare against weak baselines.
How do we get there? Open-Source Code!
… a good start, but far from enough
TREC 2015 “Open Runs”: 79 submitted runs…
Voorhees et al. Promoting Repeatability Through Open Runs. EVIA 2016.
Number of runs successfully replicated: 0
Voorhees et al. Promoting Repeatability Through Open Runs. EVIA 2016.
How do we get there? Open-Source Code!
… a good start, but far from enough
Ask developers to show us how!
Open-Source IR Reproducibility Challenge (OSIRRC), SIGIR 2015
Workshop on Reproducibility, Inexplicability, and Generalizability of Results (RIGOR)
Participants contributed end-to-end scripts for replicating ad hoc retrieval experiments
Lin et al. Toward Reproducible Baselines: The Open-Source IR Reproducibility Challenge. ECIR 2016.
System Effectiveness (MAP): 7 participating systems, GOV2 collection
System Efficiency (search time in ms): 7 participating systems, GOV2 collection
Effectiveness/Efficiency Tradeoff (search time in ms vs. MAP): 7 participating systems, GOV2 collection; points include Indri, Galago, Terrier, MG4J, ATIRE, Lucene, and JASS configurations
How do we get there? Open-Source Code!
… a good start, but far from enough
Ask developers to show us how!
It worked, but…
What worked well?
We actually pulled it off!
What didn’t work well?
Technical infrastructure was brittle
Replication scripts were too under-constrained
Infrastructure Source: Wikipedia (Burj Khalifa)
VMs: each app runs on its own guest OS inside a VM; VMs run on a hypervisor on top of the physical machine.
Containers: each app runs in its own container; containers share a container engine and the host OS on the physical machine.
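To make the container idea concrete, here is a minimal sketch of launching a containerized retrieval system with Docker from Python; the image name and host paths are hypothetical placeholders, not one of the workshop images.

```python
import subprocess

# A minimal sketch of launching a containerized retrieval system with Docker.
# The image name and host paths below are hypothetical placeholders; they only
# illustrate the container stack described above.
subprocess.run(
    [
        "docker", "run", "--rm",
        # mount the (licensed) test collection read-only into the container
        "-v", "/path/to/collection:/input/collection:ro",
        # mount a host directory where the container can write its output
        "-v", "/path/to/output:/output",
        "example/ir-system:latest",
    ],
    check=True,
)
```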
Infrastructure Source: Wikipedia (Burj Khalifa)
Workshop Goals
1. Develop a common Docker specification for capturing ad hoc retrieval experiments: the “jig”.
2. Build a library of curated images that work with the jig.
3. Take over the world! (encourage adoption, broaden to other tasks, etc.)
The jig and the Docker image:
The user specifies <image>:<tag>; the jig starts the image.
Prepare phase: the jig triggers the init hook, then the index hook, and creates a snapshot <snapshot> of the indexed image.
Search phase: the jig triggers the search hook with the snapshot; the search hook produces run files, which are scored with trec_eval.
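Read as pseudocode, the flow above is a thin driver around the Docker CLI. The sketch below only illustrates that flow and is not the jig’s actual implementation: the hook names (init, index, search) and the trec_eval step come from the slide, while the image tag, file paths, run-file name, and the way hooks are invoked are assumptions.

```python
import subprocess

IMAGE = "example/ir-system:latest"        # hypothetical <image>:<tag>
SNAPSHOT = "example/ir-system:snapshot"   # snapshot committed after indexing
COLLECTION = "/path/to/collection"        # test collection on the host
OUTPUT = "/path/to/output"                # run files are written here
QRELS = "/path/to/qrels.txt"              # relevance judgments for trec_eval

def run(cmd):
    """Print and execute a command, failing loudly on error."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# --- prepare phase: init hook, index hook, then snapshot the indexed image ---
run(["docker", "run", "--name", "prep",
     "-v", f"{COLLECTION}:/input/collection:ro",
     IMAGE, "sh", "-c", "./init && ./index"])   # hooks assumed to be executables in the image
run(["docker", "commit", "prep", SNAPSHOT])     # <snapshot> now contains the built index
run(["docker", "rm", "prep"])

# --- search phase: run the search hook against the snapshot ---
run(["docker", "run", "--rm",
     "-v", f"{OUTPUT}:/output",
     SNAPSHOT, "./search"])                     # writes TREC run files to /output

# --- evaluation: score the run files with trec_eval (assumed on the host PATH) ---
run(["trec_eval", "-m", "map", QRELS, f"{OUTPUT}/run.txt"])
```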
Source: Flickr (https://www.flickr.com/photos/m00k/15789986125/)
17 images from 13 different teams
Focus on newswire collections: Robust04, Core17, Core18
Official runs on Microsoft Azure (thanks Microsoft for free credits!)
Anserini (University of Waterloo)
Anserini-bm25prf (Waseda University)
ATIRE (University of Otago)
Birch (University of Waterloo)
Elastirini (University of Waterloo)
EntityRetrieval (Ryerson University)
Galago (University of Massachusetts)
ielab (University of Queensland)
Indri (TU Delft)
IRC-CENTRE2019 (Technische Hochschule Köln)
JASS (University of Otago)
JASSv2 (University of Otago)
NVSM (University of Padua)
OldDog (Radboud University)
PISA (New York University and RMIT University)
Solrini (University of Waterloo)
Terrier (TU Delft and University of Glasgow)
Robust04: 49 runs from 13 images
Images captured diverse models:
query expansion and relevance feedback
conjunctive and efficiency-oriented query processing
neural ranking models
Core17: 12 runs from 6 images
Core18: 19 runs from 4 images
Robust04: 49 runs from 13 images
Who won? Source: Time Magazine
But it’s not a competition! Source: Washington Post
TREC best: 0.333
TREC median (title): 0.258
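These reference points can be compared against any replicated run. A small sketch, assuming the numbers above are MAP scores on Robust04, that trec_eval is on the PATH, and using placeholder file names:

```python
import subprocess

QRELS = "qrels.robust04.txt"          # placeholder qrels file name
RUN = "run.robust04.example.txt"      # placeholder run file produced by a jig image

# trec_eval prints a line of the form: "map \t all \t 0.2531"
out = subprocess.run(
    ["trec_eval", "-m", "map", QRELS, RUN],
    capture_output=True, text=True, check=True,
).stdout
score = float(out.split()[-1])

print(f"MAP = {score:.4f}")
print("Reference points: TREC median (title) = 0.258, TREC best = 0.333")
```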
Workshop Goals
✓ 1. Develop a common Docker specification for capturing ad hoc retrieval experiments: the “jig”.
✓ 2. Build a library of curated images that work with the jig.
? 3. Take over the world! (encourage adoption, broaden to other tasks, etc.)
What’s next? Source: flickr (https://www.flickr.com/photos/39414578@N03/16042029002)