Webis at the TREC 2012 Session track

Matthias Hagen, Martin Potthast, Matthias Busse, Jakob Gomoll, Jannis Harder, Benno Stein
Bauhaus-Universität Weimar
matthias.hagen@uni-weimar.de

TREC 2012, Gaithersburg, November 9, 2012
Two research questions . . .
Question 1: query expansion depending on session type

- “Low risk” session: QE might be beneficial (low risk of misunderstanding)
- “High risk” session: QE considered harmful (high risk of misunderstanding)
Question 2: knowledge from other users’ sessions

- Sessions with the same goals
Two standard retrieval models

- ChatNoir [chatnoir.webis.de]: BM25F + PageRank + proximity; used in runs 1 and 3
- Indri [boston.lti.cs.cmu.edu/Services/]: language modeling + inference network; used in run 2
Runs 1 and 2: query expansion by session types

- Compare the current query q to each previous query
- If q is not a repetition, generalization, or specialization, then populate
  - Q: previous queries
  - R: previous results (documents)
  - S: previous snippets
  - T: previous titles

Query expansion approach
- RL2: at most two keyphrases from Q
- RL3: additionally at most one keyphrase from each of R, S, T
- RL4: only clicked results in R, S, T
- Weights: 2.0 from q, 0.6 from Q, 0.2 from R, 0.1 from S or T
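To make the procedure concrete, here is a minimal Python sketch of the risk-aware expansion decision. It assumes queries are compared as lowercased term sets and leaves keyphrase extraction abstract (the keyphrase lists are passed in); the names session_type and expanded_query and the Indri-style #weight query syntax are illustrative assumptions, not the actual Webis implementation.

    def session_type(q, previous):
        """Classify the current query q against the session's previous queries
        (assumption: term-set comparison approximates the session types)."""
        q_terms = set(q.lower().split())
        for p in previous:
            p_terms = set(p.lower().split())
            if q_terms == p_terms:
                return "repetition"
            if q_terms < p_terms:   # q drops terms: broader intent
                return "generalization"
            if q_terms > p_terms:   # q adds terms: narrower intent
                return "specialization"
        return "new"  # none of the risky types: expansion may help

    def expanded_query(q, phrases_Q, phrases_R, phrases_S, phrases_T):
        """Build a weighted expanded query with the weights from the slide:
        2.0 for q, 0.6 for Q, 0.2 for R, 0.1 for S or T."""
        parts = [(2.0, q)]
        parts += [(0.6, k) for k in phrases_Q[:2]]                  # at most two from Q (RL2)
        parts += [(0.2, k) for k in phrases_R[:1]]                  # at most one from R
        parts += [(0.1, k) for k in phrases_S[:1] + phrases_T[:1]]  # at most one from S and T each
        # Indri-style weighted combination (illustrative syntax)
        return "#weight( " + " ".join(f"{w} #combine({p})" for w, p in parts) + " )"

Expansion is triggered only when session_type(q, previous) returns "new", i.e. in the low-risk case.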
Runs 1 and 2: postprocessing

Result list postprocessing
- Aspect sessions: show Wikipedia VIP segments (find a long Wikipedia title in q, show the article)
- Clicks: results from similar sessions at ranks 3 and 4
- Long documents: removed when ≥ 7000 words
- Duplicates: removed when 5-gram cosine similarity ≥ 0.98

Run 2
- Indri instead of ChatNoir
- Query segmentation [Hagen et al., CIKM 2012]
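The two filters are simple to state; the following sketch shows one possible reading, assuming word 5-grams (the slide does not say whether word or character 5-grams were used) and results given as (doc_id, text) pairs. The helper names are illustrative.

    from collections import Counter
    from math import sqrt

    def five_grams(text):
        """Word 5-gram frequency vector of a document."""
        words = text.lower().split()
        return Counter(tuple(words[i:i + 5]) for i in range(len(words) - 4))

    def cosine(a, b):
        """Cosine similarity of two sparse frequency vectors."""
        dot = sum(a[g] * b[g] for g in a if g in b)
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    def postprocess(ranked_docs):
        """Drop overlong documents and near-duplicates, keeping rank order."""
        kept, vectors = [], []
        for doc_id, text in ranked_docs:
            if len(text.split()) >= 7000:                      # long-document filter
                continue
            vec = five_grams(text)
            if any(cosine(vec, v) >= 0.98 for v in vectors):   # near-duplicate filter
                continue
            kept.append((doc_id, text))
            vectors.append(vec)
        return kept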
Runs 1 and 2: nDCG@10 influence

                     RL1      RL2       RL3       RL4
  run 1 (ChatNoir)   0.0865   0.1174 ⇑  0.1204 ⇑  0.1171 ⇑
  run 2 (Indri)      0.2053   0.2097 ↑  0.2102 ↑  0.2077 ↑

  (⇑ statistically significant improvement over RL1; ↑ improvement, not significant)

Observations
- ChatNoir’s initial performance rather low
- ChatNoir (BM25F) significantly benefits from risk-aware QE
- Indri (LM) benefits, too (not statistically significant)
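For reference, nDCG@10 normalizes the discounted cumulative gain of the top 10 results by that of an ideal reordering; one common graded formulation (variants differ in the gain function) is:

    \[
      \mathrm{nDCG@10} = \frac{\mathrm{DCG@10}}{\mathrm{IDCG@10}},
      \qquad
      \mathrm{DCG@10} = \sum_{i=1}^{10} \frac{2^{\mathit{rel}_i} - 1}{\log_2(i+1)},
    \]

where rel_i is the graded relevance of the result at rank i and IDCG@10 is the DCG@10 of the ideal ranking.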
Run 3: knowledge from other users’ sessions

Search shortcuts [Baraglia et al., RecSys 2009]
- Query expansion with terms from related sessions
- RGU-ISTI-Essex team used the Microsoft RFP 2006 log
- Performance gain not significant
- Not many related sessions found?!

Our idea
- Use the TREC sessions as source, and
- manually create more related sessions (three for sessions 1, 3, 8, 34, 38, 46, 53, 64, 66, 69, and 92)
- Should count as a manual run?!
Run 3: query expansion + postprocessing

Query expansion
- Analogous to runs 1 and 2, but Q, R, S, and T are populated from related sessions only

Result list postprocessing
- Analogous to runs 1 and 2, but the top ranks are populated with clicks from related sessions only
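Since the pipeline itself is unchanged, run 3 only swaps the sources. A minimal sketch of that swap, assuming a hypothetical Session record holding a session's queries and its results as (title, snippet, document) tuples:

    from dataclasses import dataclass, field

    @dataclass
    class Session:
        queries: list                               # queries issued in the session
        results: list = field(default_factory=list) # (title, snippet, document) tuples

    def sources_from_related(related):
        """Populate the expansion sources Q, R, S, T from related
        sessions only (run 3); runs 1 and 2 would draw them from the
        current session instead."""
        Q = [q for s in related for q in s.queries]
        R = [doc for s in related for (_, _, doc) in s.results]
        S = [sn for s in related for (_, sn, _) in s.results]
        T = [t for s in related for (t, _, _) in s.results]
        return Q, R, S, T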
Run 3: nDCG@10 influence

                           RL1      RL2       RL3       RL4
  run 1 (same session)     0.0865   0.1174 ⇑  0.1204 ⇑  0.1171 ⇑
  run 3 (other sessions)   0.1086   0.1220 ⇑  0.1401 ⇑  0.1796 ⇑

Observations
- Other users’ sessions can help a lot (risk-aware)
- More than the same user’s previous interactions
Run 3: the best of both worlds?!

Low risk + related sessions
Almost the end: The take-home messages!
What we have (not) done

Main results
- Risk-aware session type consideration → mostly performance gains, hardly any losses
- Impact on standard retrieval models → BM25F ⇑ vs. Indri ↑
- Other users’ sessions → 65% improvement for BM25F

Future work
- More fine-grained types
- Other retrieval models
- QE techniques
- When to step in?

Thank you!