Relevant Facets @lucianprecup @a2lean #haystackconf Berlin EU 2019

Lucian Precup Radu Pop Relevant Facets @lucianprecup @a2lean #haystackconf Berlin EU 2019

// Poll
• How many of you are using facets with the search engines you implement ?
• Who is doing statistics on facet usage ?
• Who is using Solr ?
• Who is using Elasticsearch ?
• Other search technology ?
• Who speaks French ?

// Why this talk ?

// Facets ?
• Used to define filters that refine the initial query
• Used for disambiguation
• Give a holistic view over the search results
• Help find the needle in the haystack more quickly


Hierarchical facets

Other “exotic” facets

Facets on a mobile device

// Facets and Filters
[Figure labels: Filters, Facets]

// Why are facets important ?
• More and more data and less and less space to display it
• New ways of searching: voice, assistants, chat bots
[Figure labels: VOICE ONLY, VOICE + SCREEN (multimodal)]

// How are facets implemented ?
Facets are a standard feature of modern search engines. Apache Lucene has great support for everything around facets.
• Solr : field value faceting, range faceting, pivot faceting, interval faceting, block join faceting, …
• Elasticsearch : aggregations, sub-aggregations, top hits aggregation, histogram aggregation, range aggregations, geo aggregations, …
The User Experience with facets and the way they are "displayed" can be very diverse
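As a minimal sketch of the simplest case on the Elasticsearch side, the request body below asks for search hits plus one terms aggregation (one facet). The field names `title` and `brand` and the helper name `build_facet_request` are assumptions made for illustration; they do not come from the talk.

```python
import json


def build_facet_request(query_text, facet_field, size=10):
    """Build a search body that returns hits plus one terms facet."""
    return {
        # "title" is a hypothetical full-text field
        "query": {"match": {"title": query_text}},
        "aggs": {
            # One bucket per distinct value of the facet field
            "by_" + facet_field: {
                "terms": {"field": facet_field, "size": size}
            }
        },
    }


body = build_facet_request("laptop", "brand")
print(json.dumps(body, indent=2))
```

The body would be sent to the `_search` endpoint of an index; the facet values then come back as buckets under the aggregation name.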

// Structure of the talk
• Examples of facet implementations
• Challenges with facets and possible solutions
• Challenges with search in general and how facets can help
• Technical implementation examples are with Elasticsearch
• We focus less on the "graphical" display of facets and more on the technical issues with their relevancy

// Challenge #1: marketplaces
• Issue: the heterogeneity of results and the number of candidate facets

Heterogeneity of results: Solution 1
Facets based on top N results:
• Fetch the top N results (first page + a few of the next ones)
• Retain only the facets applicable to these top N results
Implementation details:
• First query: query term
• Fetch the first N document ids (let’s say max 1024)
• Second query : terms filter on document ids and aggregations
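The two-query approach above can be sketched as plain request bodies. This is an illustration under assumptions, not the authors' code: the field name `title`, the facet field names and the helper names are hypothetical, and the fake response only mimics the shape of an Elasticsearch search response.

```python
def first_query(query_text, n=1024):
    """First query: the user's query term, fetching only the top N ids."""
    return {
        "size": n,
        "_source": False,  # only the document ids are needed
        "query": {"match": {"title": query_text}},  # hypothetical field
    }


def extract_ids(response):
    """Collect the document ids from a search response."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def second_query(doc_ids, facet_fields, size=10):
    """Second query: terms filter on the ids, plus one aggregation per facet."""
    return {
        "size": 0,  # only the aggregations are needed
        "query": {"bool": {"filter": {"terms": {"_id": doc_ids}}}},
        "aggs": {
            field: {"terms": {"field": field, "size": size}}
            for field in facet_fields
        },
    }


# Hand-written stand-in for the response to the first query
fake_response = {"hits": {"hits": [{"_id": "p1"}, {"_id": "p7"}]}}
ids = extract_ids(fake_response)
facets_body = second_query(ids, ["brand", "color"])
```

Facets computed this way reflect only the best-ranked documents, so a facet that applies only to results deep in the tail does not show up.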

Heterogeneity of results: Solution 2
• Modeling with a single facet-name / facet-value field tuple and the nested type
• Need to treat strings, numbers and booleans differently
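One way to picture this modeling is sketched below, under assumptions: three nested fields (`string_facets`, `number_facets`, `boolean_facets`), each holding name/value pairs, so that each value type gets its own mapping. The field names and both helpers are hypothetical and only illustrate the idea.

```python
def facet_mapping():
    """Index mapping with one nested name/value field per value type."""
    def nested(value_type):
        return {
            "type": "nested",
            "properties": {
                "name": {"type": "keyword"},
                "value": {"type": value_type},
            },
        }

    return {
        "mappings": {
            "properties": {
                "string_facets": nested("keyword"),
                "number_facets": nested("double"),
                "boolean_facets": nested("boolean"),
            }
        }
    }


def to_facet_fields(attributes):
    """Split a flat attribute dict into the three nested facet fields."""
    doc = {"string_facets": [], "number_facets": [], "boolean_facets": []}
    for name, value in attributes.items():
        if isinstance(value, bool):  # test bool first: bool is a subclass of int
            doc["boolean_facets"].append({"name": name, "value": value})
        elif isinstance(value, (int, float)):
            doc["number_facets"].append({"name": name, "value": value})
        else:
            doc["string_facets"].append({"name": name, "value": str(value)})
    return doc


doc = to_facet_fields({"color": "red", "weight_kg": 1.5, "organic": True})
```

With this layout a marketplace can index products with arbitrary attributes without adding one mapped field per attribute.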

Heterogeneity of results: Solution 2 – the query
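The query shown on this slide is not in the transcript. As a hedged reconstruction of what such a query can look like, the sketch below assumes documents carry a nested field (here called `string_facets`, a hypothetical name) made of `name` and `value` keyword sub-fields, and nests a terms aggregation on the values under a terms aggregation on the names.

```python
def nested_facets_agg(path="string_facets", names_size=20, values_size=10):
    """Aggregation: facet names, then the top values under each name."""
    return {
        path: {
            # Step into the nested name/value objects
            "nested": {"path": path},
            "aggs": {
                "names": {
                    "terms": {"field": path + ".name", "size": names_size},
                    "aggs": {
                        "values": {
                            "terms": {
                                "field": path + ".value",
                                "size": values_size,
                            }
                        }
                    },
                }
            },
        }
    }


request = {
    "query": {"match": {"title": "bike"}},  # hypothetical text field
    "aggs": nested_facets_agg(),
}
```

Each bucket under `names` is then a candidate facet, and its `values` sub-buckets are the facet values to display.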

// Challenge #2: auto-completion

Auto-completion: solution
• Products index
• Suggestions index
• Use the Update API here and also increase the number of occurrences
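A minimal sketch of that update step, assuming each facet value of a product becomes one document in the suggestions index and an `occurrences` counter is incremented through a scripted upsert. The field names `suggestion` and `occurrences`, the id scheme and the helper names are assumptions, not taken from the slides.

```python
def suggestion_update(facet_value, count=1):
    """Return (doc_id, body) for an Update API call on the suggestions index."""
    doc_id = facet_value.strip().lower()  # hypothetical id scheme
    body = {
        # If the suggestion already exists, increase its counter
        "script": {
            "lang": "painless",
            "source": "ctx._source.occurrences += params.count",
            "params": {"count": count},
        },
        # Otherwise create it with an initial count
        "upsert": {"suggestion": facet_value, "occurrences": count},
    }
    return doc_id, body


def updates_for_product(product, facet_fields):
    """One update per facet value present on the product."""
    return [
        suggestion_update(product[field])
        for field in facet_fields
        if product.get(field)
    ]


updates = updates_for_product(
    {"name": "Road bike", "brand": "Acme", "color": "Red"},
    ["brand", "color", "size"],
)
```

The occurrence count can later serve as a ranking signal, so that frequent facet values are suggested first.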

Auto-completion: solution – the "Suggestions" index and the query

Auto-completion: solution – the result

Auto-completion: solution – the shortcut

// Challenge #3: assistants
• Often the first responses of an assistant are suggestions for additional filters that refine the query.

How to narrow ?
• Often the first responses of an assistant are suggestions for additional filters that refine the query
• “Quick win” solution :
  - Filters
• Issue :
  - Which facets to choose?
• Prerequisite:
  - Your search engine should already have relevant filters
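The slides leave open how to decide which facet to offer first. One possible heuristic, which is an assumption added here and not the authors' method, is to pick the facet whose value counts are spread most evenly over the current results (highest Shannon entropy), since choosing any one of its values cuts the result set the most on average.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of facet value counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )


def most_disambiguating_facet(facet_counts):
    """Return the facet name whose values split the results most evenly."""
    return max(
        facet_counts,
        key=lambda name: entropy(facet_counts[name].values()),
    )


# Made-up counts, as they could come back from terms aggregations
facet_counts = {
    "brand": {"Acme": 50, "Bolt": 50},
    "color": {"red": 99, "blue": 1},
}
best = most_disambiguating_facet(facet_counts)
```

Here `brand` wins: asking "which brand?" halves the results whatever the answer, while asking about `color` barely narrows them for most users.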

// Challenge #4: relevant facet values
• Issue: how to make facet values relevant in the context of many "less relevant" results ?

Relevant facet values: the solution
Solutions: work on your search precision
Analytics and data science have clues: for instance, when clients type “tomato”, is there a category which groups most of the clicks ?
All you must do is prefilter some facets (or even all the results) with this category : 80% of the result set will disappear and your filters will look good !
Examples of prefiltering at Carrefour:
• 11% of results for “tomatos” are in the “Fresh vegetables” category but they represent 86% of products added to basket
• 24% of results for “rice” are in the “Pasta and Rice” category and represent 90% of purchases
• 8% of results for “sugar” are in the “Sugar and sweeteners” category and represent 90% of purchases
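The prefiltering idea can be sketched as follows, under assumptions: per-query click (or add-to-basket) counts by category are available from analytics, and a category is used as a prefilter only when it concentrates at least a given share of them. The 0.8 threshold, the field names `title` and `category`, the helper names and the sample counts are all illustrative, not Carrefour's actual setup or data.

```python
def dominant_category(clicks_by_category, threshold=0.8):
    """Return the category holding at least `threshold` of the clicks, if any."""
    total = sum(clicks_by_category.values())
    if total == 0:
        return None
    category, clicks = max(clicks_by_category.items(), key=lambda kv: kv[1])
    return category if clicks / total >= threshold else None


def prefiltered_query(query_text, category):
    """Search body restricted to the dominant category when there is one."""
    match = {"match": {"title": query_text}}  # hypothetical text field
    if category is None:
        return {"query": match}
    return {
        "query": {
            "bool": {
                "must": match,
                # A filter clause narrows results without affecting scoring
                "filter": {"term": {"category": category}},
            }
        }
    }


# Made-up analytics counts for one query
clicks = {"Fresh vegetables": 86, "Canned food": 9, "Sauces": 5}
body = prefiltered_query("tomato", dominant_category(clicks))
```

Because the facets are then computed on the filtered result set, the noisy categories no longer contribute facet values.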

// Challenge #5: search in facet values
• Issue: How to bring up facet values beyond the first top N values ?
• Solutions:
  • Pagination
  • Search in Search

Search in facet values: implementation with Elasticsearch

Search in facet values: details of the filter aggregation

Search in facet values: details of the terms sub-aggregation

Search in facet values: details of the top_hits sub-aggregation and highlighting
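The request bodies from these slides are not in the transcript. The sketch below is one plausible way to combine the three pieces they name: a filter aggregation that keeps documents whose facet value matches the typed text, a terms sub-aggregation that lists the matching values, and a top_hits sub-aggregation with highlighting. The analyzed sub-field `<facet>.search`, the use of `match_phrase_prefix`, the `title` field and the function name are assumptions.

```python
def search_in_facet_values(query_text, facet_field, typed_text, size=10):
    """Find facet values matching what the user typed in the facet search box."""
    search_field = facet_field + ".search"  # hypothetical analyzed sub-field
    value_match = {"match_phrase_prefix": {search_field: typed_text}}
    return {
        "size": 0,  # only the aggregations are needed
        "query": {"match": {"title": query_text}},  # the user's main query
        "aggs": {
            "matching_docs": {
                # 1. filter aggregation: documents whose facet value matches
                "filter": value_match,
                "aggs": {
                    "values": {
                        # 2. terms sub-aggregation: the matching facet values
                        "terms": {"field": facet_field, "size": size},
                        "aggs": {
                            "sample": {
                                # 3. top_hits: one document per value, highlighted
                                "top_hits": {
                                    "size": 1,
                                    "_source": [facet_field],
                                    "highlight": {
                                        "highlight_query": value_match,
                                        "fields": {search_field: {}},
                                    },
                                }
                            }
                        },
                    }
                },
            }
        },
    }


body = search_in_facet_values("shoes", "brand", "adi")
```

The `highlight_query` is set explicitly because the text to highlight is what was typed in the facet search box, not the main query.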

// Challenge #6: unstructured data
• Issue: the lack of structure makes it difficult to suggest additional query refinements
• Solutions:
  • Clustering (like http://project.carrot2.org/)
  • Entity extraction (like https://www.basistech.com/text-analytics/rosette/entity-extractor/ or https://twitter.com/dep4b/status/1121141764503609345)

Display “facets” with clustering
http://project.carrot2.org/

Enrich the data with entity extraction
Haystack is the conference for improving search relevance. If you're like us, you work to understand the shiny new tools or dense academic papers out there that promise the moon. Then you puzzle how to apply those insights to your search problem, in your search stack. But the path isn't always easy, and the promised gains don't always materialize. Haystack is the conference for organizations where search, matching, and relevance really matters to the bottom line. For search managers, developers, relevance engineers & data scientists finding ways to innovate, see past the silver bullets, and share what actually has worked well for their unique problems. Please come share and learn! https://haystackconf.com/
• Conference: Haystack
• Domain: search

Facets on unstructured text after entity extraction
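Once an entity extractor has annotated each text, facets can be built from the annotations like from any structured field. The sketch below assumes each document carries an `entities` dict mapping an entity type to the values found in its text; that layout and the function name are hypothetical, and the extraction step itself is out of scope here.

```python
from collections import Counter, defaultdict


def facet_counts(docs):
    """Count, per entity type, how many documents mention each value."""
    counts = defaultdict(Counter)
    for doc in docs:
        for entity_type, values in doc.get("entities", {}).items():
            # Count each value once per document, like a terms aggregation
            for value in set(values):
                counts[entity_type][value] += 1
    return counts


# Made-up annotated documents
docs = [
    {"entities": {"Conference": ["Haystack"], "Domain": ["search"]}},
    {"entities": {"Domain": ["search", "search"]}},
    {"entities": {}},
]
counts = facet_counts(docs)
```

In a search engine the same result comes from indexing the extracted entities as keyword fields and running terms aggregations on them.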

// Conclusions and takeaways
• More data, less space → Facets are more and more important
• In order to be useful → Facets should be relevant
• Modern search engines have great support for facets

// Conclusions and takeaways
• Marketplaces: When too many possible facets → the relevant ones should be driven by the most relevant results
• Auto-completion: Use facet values as suggestions and disambiguation techniques
• Assistants: When too many results → choose the facet and filter suggestions that disambiguate most as the first answer
• Relevant facet values: When there is a risk of noise in the results → avoid bringing it to facet values
• Search in facet values: When too many facet values → bring up those beyond the top N with search (not with JavaScript)
• Unstructured data: Use clustering and entity extraction to be able to define facets

Thank You ! • Lucian Precup • Radu Pop • @lucianprecup • @a2lean • #haystackconf • @o19s • Berlin EU 2019
