
Retrieval as Interaction – African Summer School on Machine Learning



  1. Retrieval as Interaction – African Summer School on Machine Learning for Data Mining and Search. Maarten de Rijke, University of Amsterdam, derijke@uva.nl. January 14, 2019

  2. Based on joint work with Abhinav Khaitan, Ana Lucic, Anne Schuth, Boris Sharchilev, Branislav Kveton, Chang Li, Csaba Szepesvári, Daan Odijk, Edgar Meij, Giorgio Stefanoni, Harrie Oosterhuis, Hinda Haned, Ilya Markov, Julia Kiseleva, Jun Ma, Kambadur Prabhanjan, Maartje ter Hoeve, Masrour Zoghi, Miles Osborne, Nikos Voskarides, Pavel Serdyukov, Pengjie Ren, Ridho Reinanda, Rolf Jagerman, Tor Lattimore, Yujie Lin, Yury Ustinovskiy, Zhaochun Ren, Ziming Li, and Zhumin Chen

  3. Background

  4. We need information to make decisions • to identify or structure a problem or opportunity • to put the problem or opportunity in context • to generate alternative solutions • to choose the best alternative

  7. Information retrieval – Getting the right information to the right people in the right way

  8. Information retrieval – Two phases

  9. Information retrieval – Two phases: online development and offline development

  10. Information retrieval – Two phases: online development and offline development

  11. Information retrieval – The online phase [Diagram: the user environment issues a query; the retrieval system (agent) takes an action by generating a document list; the user examines the list and generates implicit feedback; the evaluation component measures this feedback and turns it into a reward and state for the agent]
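
To make this loop concrete, here is a minimal, self-contained Python sketch of an agent-environment interaction of the kind the diagram depicts. The linear ranking agent, the simulated click model, and the update rule are all illustrative assumptions, not part of the deck.

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_features = 50, 5
docs = rng.normal(size=(n_docs, n_features))   # document feature vectors
true_w = rng.normal(size=n_features)           # hidden user preference (unknown to the agent)
weights = np.zeros(n_features)                 # the agent's current ranking model

for step in range(1000):
    scores = docs @ weights
    action = np.argsort(-scores)[:10]          # action: the ranked list shown to the user
    # Implicit feedback: the simulated user clicks documents they actually like.
    clicks = (docs[action] @ true_w + rng.normal(scale=0.5, size=10)) > 0
    reward = clicks.mean()                     # reward as measured by the evaluation component
    # Model update: move towards clicked documents, away from skipped ones.
    weights += 0.01 * docs[action].T @ (clicks.astype(float) - 0.5)

print("reward in the final round:", reward)
```

The point is the shape of the loop: act (show a list), sense (observe implicit feedback), and update, rather than answering a one-off query from a fixed, offline-trained model.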

  12. How does it all fit together? A “spaghetti” picture for search [Diagram: offline components – crawl/ingest, extraction, enriching, aggregation, indexing, query improvement, scheduler, sources, indexes, logs – feed online components – front door, UX, query understanding, vertical rankers, top-k retrievers, blender, learning – all wrapped in an evaluation framework with A/B testing, interleaving, and offline evaluation] How does the AFIRM program fit?

  13. What does the offline phase mean? A learning process for Man and Machine

  14. What does this mean for machines? Sense – Plan – Act

  15. What does this mean for machines? Understand and track intent

  16. What does this mean for machines? Understand and track intent Update models and the space of possible actions (answer, ranked list, SERP, . . . )

  17. What does this mean for machines? Understand and track intent Update models and the space of possible actions (answer, ranked list, SERP, . . . ) Select the best action and sense its effect

  18. What does this mean for machines? Life is easier for systems than in an offline-trained query-response paradigm • Engage with the user • Educate/train the user • Ask the user for clarification

  19. What does this mean for machines? Life is easier for systems than in an offline-trained query-response paradigm • Engage with the user • Educate/train the user • Ask the user for clarification Life is harder for systems than in an offline-trained query-response paradigm • Safety – Don’t hurt anyone • Explicability – Be transparent about the model and about decisions

  20. Unpacking Safety & Explicability

  21. The plan for this morning: Background, Safety, Explicability, Conclusion

  22. Safety

  23. Safety • Don’t perform worse than a reasonable baseline, e.g., the production system people are used to • Don’t take too long to learn to improve • Don’t leave anyone behind & give everyone a fair deal • Don’t fall into sinkholes – be diverse . . .

  24. When people change their mind • Off-policy evaluation uses historical interaction data to estimate a policy’s performance • Non-stationarity arises when user preferences change over time • Idea: use a decaying average to correct for bias in traditional IPS • The exponential decay IPS estimator closely follows the actual performance of the recommender on LastFM, while the standard IPS estimator fails to approximate it [Plot: reward over time for the true value and the standard, α-decayed, and adaptive α-decayed IPS estimators] R. Jagerman et al. When people change their mind. In WSDM 2019, to appear.
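
To illustrate the contrast between standard IPS and a decayed variant, here is a small synthetic sketch. The logged data, the decay scheme, and the simplified estimator are assumptions for illustration; the exact estimator in Jagerman et al. may differ.

```python
import numpy as np

def ips(rewards, target_probs, logging_probs):
    """Standard IPS estimate of the target policy's value."""
    w = target_probs / logging_probs
    return np.mean(w * rewards)

def decayed_ips(rewards, target_probs, logging_probs, alpha=0.9999):
    """Exponential-decay IPS: recent interactions count more, older ones decay."""
    t = len(rewards)
    decay = alpha ** np.arange(t - 1, -1, -1)   # weight 1 for the newest sample
    w = target_probs / logging_probs
    return np.sum(decay * w * rewards) / np.sum(decay)

# Synthetic log: the logging policy shows item A half the time; users like A
# early on (click prob. 0.8) but change their mind later (click prob. 0.2).
rng = np.random.default_rng(0)
t = 100_000
logging_probs = np.full(t, 0.5)
took_a = rng.random(t) < logging_probs
p_click_a = np.where(np.arange(t) < t // 2, 0.8, 0.2)
rewards = np.where(took_a, rng.random(t) < p_click_a, 0.0).astype(float)
target_probs = np.where(took_a, 1.0, 0.0)       # target policy: always show item A

print("standard IPS :", ips(rewards, target_probs, logging_probs))
print("decayed IPS  :", decayed_ips(rewards, target_probs, logging_probs))
```

On this data the standard estimate sits near the historical average (around 0.5), while the decayed estimate tracks the current, lower value of the target policy.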

  25. Safe online learning to re-rank via implicit click feedback • Safely learn to re-rank in an online setting • Learn user preferences not from scratch but by combining the strengths of the online and offline settings • Start with an initial ranked list (possibly learned offline) and improve it online by gradually swapping high-ranked less attractive items for low-ranked more attractive ones [Plot: regret vs. step n under the position-based click model (PBM)] C. Li et al. Safe online learning to re-rank via implicit click feedback. Under review.
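
The following toy sketch captures the spirit of starting from a given list and only promoting a lower-ranked item once click evidence clearly favours it. The click simulation and the simple evidence threshold are assumptions for illustration; the actual algorithm, its randomized exploration, and its guarantees are in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
attractiveness = rng.random(10)                 # hidden per-document click probability
ranking = list(np.argsort(rng.random(10)))      # initial list, e.g. from an offline ranker
clicks_won = np.zeros((10, 10))                 # clicks_won[i, j]: times doc i beat doc j

for step in range(20_000):
    k = rng.integers(0, 9)                      # pick a neighbouring pair to compare
    upper, lower = ranking[k], ranking[k + 1]
    # Simulated user: each of the two documents is clicked with its attractiveness.
    if rng.random() < attractiveness[upper]:
        clicks_won[upper, lower] += 1
    if rng.random() < attractiveness[lower]:
        clicks_won[lower, upper] += 1
    # Swap only once the evidence clearly favours the lower-ranked document.
    if clicks_won[lower, upper] > clicks_won[upper, lower] + 10:
        ranking[k], ranking[k + 1] = lower, upper

print("learned order :", ranking)
print("ideal order   :", list(np.argsort(-attractiveness)))
```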

  26. Deep learning with logged bandit feedback • Play it safe by obtaining a lot more training data • Train deep networks from data collected using a running system – orders of magnitude more data • How – counterfactual risk minimization using an equivariant empirical risk estimator with variance regularization • The resulting objective can be decomposed in a way that allows stochastic gradient descent training [Plot: test error rate vs. number of bandit-feedback examples for Bandit-ResNet and a fully-supervised ResNet with cross-entropy loss] T. Joachims et al. Deep learning with logged bandit feedback. In ICLR 2018.
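
A minimal sketch of the counterfactual (IPS-weighted) objective such training optimizes, with a simplified variance penalty standing in for the regularizer described above. The logged data is synthetic and the estimator is not the exact equivariant one from the paper.

```python
import numpy as np

def counterfactual_risk(new_probs, losses, logging_probs, lam=0.1):
    """IPS estimate of the new policy's risk plus a simple variance penalty."""
    w = new_probs / logging_probs                   # importance weights
    terms = w * losses
    risk = terms.mean()
    variance_penalty = lam * np.sqrt(terms.var() / len(terms))
    return risk + variance_penalty

# Synthetic logged bandit feedback: for each interaction we know the loss the
# shown action incurred and the probability the logging policy gave to it.
rng = np.random.default_rng(0)
losses = rng.random(1000)
logging_probs = rng.uniform(0.1, 0.9, size=1000)
new_probs = np.clip(logging_probs + rng.normal(scale=0.05, size=1000), 0.01, 1.0)

print("estimated risk of new policy:", counterfactual_risk(new_probs, losses, logging_probs))
```

In the full method this objective is rewritten so that it decomposes over examples and can be minimized with stochastic gradient descent on a deep network.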

  27. Dialogue generation: From imitation learning to inverse reinforcement learning • Making sure system responses are informative and engaging • An adversarial dialogue generation model that provides a more accurate and precise reward signal for generator training • Improved stability of adversarial training by employing causal entropy regularization Z. Li et al. Dialogue generation: From imitation learning to inverse reinforcement learning. In AAAI 2019, to appear.

  28. Differentiable unbiased online learning to rank • Dueling Bandit Gradient Descent (DBGD) – the first online learning to rank method – learns slowly, hits a ceiling, and fails to optimize neural models • PDGD – unbiased, differentiable, and able to optimize neural ranking models [Diagram: the DBGD loop – the current weights and a perturbed copy produce rankings A and B, which are interleaved, displayed to the user, and compared to drive the learning update] [Plot: NDCG vs. impressions under a perfect click model for DBGD, MGD, and PDGD with linear and neural rankers, against an offline LambdaMart baseline] H. Oosterhuis and M. de Rijke. Differentiable unbiased online learning to rank. In CIKM 2018.
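
For reference, here is a compact sketch of the DBGD baseline loop: perturb the current ranker, compare the two, and step towards the perturbation when it wins. The comparison below is simulated from a hidden ideal ranker rather than inferred from interleaved clicks, so it only illustrates the update scheme, not PDGD or the interleaving machinery.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 10
true_w = rng.normal(size=n_features)            # hidden ideal linear ranker
w = np.zeros(n_features)
delta, eta = 1.0, 0.1                           # exploration and learning rates

def utility(weights, docs):
    """How well a linear ranker agrees with the hidden ideal ranker on these documents."""
    return np.mean((docs @ weights) * (docs @ true_w))

for step in range(5000):
    docs = rng.normal(size=(20, n_features))    # documents seen this round
    u = rng.normal(size=n_features)
    u /= np.linalg.norm(u)                      # random unit direction
    candidate = w + delta * u                   # perturbed ranker
    # Simulated interleaving outcome: does the candidate rank these documents better?
    if utility(candidate, docs) > utility(w, docs):
        w = w + eta * u                         # step towards the winning perturbation

print("cosine similarity to ideal ranker:",
      w @ true_w / (np.linalg.norm(w) * np.linalg.norm(true_w) + 1e-12))
```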

  29. The plan for this morning: Background, Safety, Explicability, Conclusion

  30. Explicability

  31. Explicability Are we “the patient” or “the doctor”? Are we the subject or the object of the interventions? Explicability • How does it work? → Generate an explanation • How did we arrive at this decision? → Especially when things go wrong

  32. Faithfully explaining rankings in a news recommender system • Explain this ranked list – what were the main features responsible for it? • Find the importance of ranking features by perturbing their values and measuring to what degree the ranking changes • Design and train a neural network that learns the explanations generated by this method and is efficient enough to run in a production environment • Explanations are faithful, real-time, and do not negatively impact engagement M. ter Hoeve et al. Faithfully explaining rankings in a news recommender system. Under review.
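
A small sketch of the perturbation idea, using a stand-in linear ranker and a simple rank-shift measure. The production system, the exact perturbation scheme, and the distilled neural explainer in the paper are more involved.

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_features = 30, 6
features = rng.normal(size=(n_docs, n_features))
weights = np.array([2.0, 1.0, 0.5, 0.1, 0.0, 0.0])   # hypothetical ranker weights

def ranking(feats):
    return np.argsort(-(feats @ weights))

base = ranking(features)

def rank_change(perturbed):
    """Average absolute shift in rank position compared to the original ranking."""
    pos_base = np.empty(n_docs)
    pos_base[base] = np.arange(n_docs)
    pos_new = np.empty(n_docs)
    pos_new[perturbed] = np.arange(n_docs)
    return np.mean(np.abs(pos_base - pos_new))

importance = []
for j in range(n_features):
    perturbed = features.copy()
    perturbed[:, j] = rng.permutation(perturbed[:, j])   # break this feature's signal
    importance.append(rank_change(ranking(perturbed)))

print("estimated feature importances:", np.round(importance, 2))
```

Features the ranker relies on heavily cause large rank shifts when perturbed; features it ignores cause almost none.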

  33. Weakly-supervised contextualization of knowledge graph facts • Explain your outcome – not necessarily how you got to it • Better understand facts returned from a knowledge graph by offering additional contextual facts • First generate a set of candidate facts in the neighborhood of a given fact and then rank the candidates using supervised learning to rank • Generate training data automatically using distant supervision • Combine features learned from data with a set of hand-crafted features N. Voskarides et al. Weakly-supervised contextualization of knowledge graph facts. In SIGIR 2018.
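
A toy sketch of the two-stage pipeline (candidate generation from the fact's neighbourhood, then ranking). The miniature graph and the overlap-based score are placeholders for illustration; the paper ranks candidates with supervised learning to rank trained via distant supervision.

```python
# Facts are (subject, predicate, object) triples in a tiny, made-up graph.
knowledge_graph = [
    ("Amsterdam", "capital_of", "Netherlands"),
    ("Amsterdam", "located_in", "North Holland"),
    ("Netherlands", "member_of", "European Union"),
    ("Rembrandt", "born_in", "Leiden"),
]

def candidate_facts(fact, graph):
    """All other facts that share an entity with the input fact (its neighbourhood)."""
    entities = {fact[0], fact[2]}
    return [f for f in graph if f != fact and (f[0] in entities or f[2] in entities)]

def score(candidate, fact):
    """Stand-in relevance score: entity overlap with the input fact."""
    return len({candidate[0], candidate[2]} & {fact[0], fact[2]})

query_fact = ("Amsterdam", "capital_of", "Netherlands")
ranked = sorted(candidate_facts(query_fact, knowledge_graph),
                key=lambda f: score(f, query_fact), reverse=True)
print(ranked)
```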

  34. Improving outfit recommendation with co-supervision of fashion generation • Explain your outcome – what were you thinking? • Fashion recommendation requires visual understanding and visual matching • A neural co-supervision learning framework • Incorporate supervision from a generation loss to better encode aesthetic information • Introduce a novel layer-to-layer matching mechanism to fuse aesthetic information more effectively Y. Lin et al. Improving outfit recommendation with co-supervision of fashion generation. Under review.

  35. Finding influential training samples for gradient boosted decision trees • Explain your errors – which training instances are responsible for them? • The influence functions framework finds the training points that exert the largest positive or negative influence on the model: how would the loss on x_test change if x_train were upweighted or downweighted? • Can be solved for parametric and non-parametric models (GBDT ensembles) B. Sharchilev et al. Finding influential training samples for gradient boosted decision trees. In ICML 2018.
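
The brute-force version of the question this framework answers can be written down directly: retrain with one training point removed and see how the loss on a test point changes. The sketch below does exactly that with scikit-learn's gradient boosting on synthetic data; the paper's contribution is an efficient approximation that avoids such retraining.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)
x_test, y_test = X[:1], y[:1]                   # treat the first point as the "test" point
X_train, y_train = X[1:], y[1:]

def test_loss(sample_weight=None):
    model = GradientBoostingRegressor(n_estimators=50, random_state=0)
    model.fit(X_train, y_train, sample_weight=sample_weight)
    return float((model.predict(x_test)[0] - y_test[0]) ** 2)

base_loss = test_loss()
influences = []
for i in range(20):                             # brute-force check of the first 20 training points
    w = np.ones(len(X_train))
    w[i] = 0.0                                  # "remove" training point i by zeroing its weight
    influences.append(test_loss(sample_weight=w) - base_loss)

most_influential = int(np.argmax(np.abs(influences)))
print("most influential of the first 20 training points:", most_influential)
```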
