Searching with Context
Reiner Kraft, Farzin Maghoul, Chi-Chao Chang, Ravi Kumar
Yahoo! Inc., Sunnyvale, CA 94089, USA
Agenda
• Motivation
• Contextual Search
  – Introduction
  – Case Study: Y!Q
  – Algorithms
    • Query Rewriting
    • Rank-Biasing
    • Iterative, Filtering Meta-Search (IFM)
• Evaluation and Results
• Conclusion
Yahoo! Confidential 2
Motivation
• Is traditional keyword-based web search as good as it gets?
  – Few qualitative differences between the search results of the major search engines
  – The introduction of anchor text and link analysis (1998) was the last major feature to significantly improve search relevance
• Search can be vastly improved along the dimension of precision
• The more we know about a user's information need, the more precise our results can be
• There is a lot of evidence (context) beyond the terms in the query box from which we can infer the information need
• Studies of web query logs show that users already employ a manual form of contextual search: they add terms to refine and reissue queries when the results for the initial query turn out to be unsatisfactory
• => How can we automatically use context to augment, refine, and improve a user's search query and obtain more relevant results?
Contextual Search – General Problems
• Gathering evidence (context)
• Representing and inferring the user's information need from that evidence
• Using that representation to get more precise results
Contextual Search – Terminology
• Context
  – In general: any additional information associated with a query
  – More narrowly: a piece of text (e.g., a few words, a sentence, a paragraph, an article) that has been authored by someone
• Context Term Vector
  – Dense representation of a context in the vector space model
  – Obtained using keyword extraction algorithms (e.g., Wen-tau Yih et al., KEA, Y! Content Analysis)
• Search Query Types
  – Simple: a few keywords, no special or expensive operators
  – Complex: keywords/phrases plus special ranking operators; more expensive to evaluate
  – Contextual: query + context term vector
• Search Engine Types
  – Standard: web search engines (e.g., Yahoo, Google, MSN, …) that support simple queries
  – Modified: a web search engine that has been modified to support complex search queries
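As a rough illustration of what a context term vector looks like, here is a minimal term-frequency sketch in Python. This is a toy stand-in, not the KEA or Y! Content Analysis extractors named above; the stopword list and normalization are illustrative assumptions.

```python
from collections import Counter
import re

# Illustrative stopword list; a real extractor uses much richer filtering.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "on", "in", "by", "was", "as"}

def context_term_vector(text, top_n=5):
    """Toy keyword extractor: rank context terms by normalized frequency,
    returning (term, weight) pairs — the 'dense representation' above."""
    terms = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    counts = Counter(terms)
    total = sum(counts.values()) or 1
    return [(t, c / total) for t, c in counts.most_common(top_n)]
```

A real keyword extraction algorithm would also consider term position, phrases, and corpus statistics; the point here is only the output shape consumed by the algorithms later in the deck.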
Case Study: Y!Q Contextual Search
• Acquiring context:
  – Y!Q provides a simple API that allows publishers to attach visual information widgets (actuators) to parts of page content (http://yq.search.yahoo.com/publisher/embed.html)
  – Y!Q lets users manually specify or select context (e.g., within Y! Toolbar, Y! Messenger, an included JavaScript library)
• Contextual search application:
  – Generates a digest (context term vector) of the associated content piece as additional terms of interest for augmenting queries (content analysis)
  – Knows how to perform contextual searches against different search back-end providers (query rewriting framework)
  – Knows how to rank results based on query + context (contextual ranking)
  – Seamless integration: displays results in an overlay or embedded within the page without interrupting the user's workflow
Example slides (screenshots omitted):
• Y!Q actuator example
• Y!Q overlay showing contextual search results
• Y!Q: searching in context
• CSRP with terms extracted from context
Y!Q System Architecture (diagram omitted)
Implementing Contextual Search
• Assumption:
  – We have a query plus a context term vector (a contextual search query)
• Design dimensions:
  – Number of queries to send to a search engine per contextual search query
  – Types of queries to send
    • Simple
    • Complex
• Algorithms:
  – Query Rewriting (QR)
  – Rank-Biasing (RB)
  – Iterative, Filtering Meta-Search (IFM)
Algorithm 1: Query Rewriting
• Combine the query and the context term vector using AND/OR semantics
• Input parameters:
  – Query, context term vector
  – Number of terms to consider from the context term vector
• Experimental setup:
  – QR1 (takes top term only)
  – QR2 (takes top two terms only)
  – … up to QR5
• Example:
  – QR3: Given query q and c = (a, b, c, d) => q AND a AND b AND c
• Pros:
  – Simplicity; supported by all major search engines
• Cons:
  – Possibly low recall for longer queries
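The QRn rewrite above is simple enough to sketch directly. A minimal Python version, assuming the context term vector is a list of (term, weight) pairs sorted by descending weight:

```python
def query_rewrite(query, context_vector, n):
    """QRn: conjoin the original query with the top-n context terms
    using AND semantics, e.g. QR3 yields 'q AND a AND b AND c'."""
    top_terms = [term for term, _weight in context_vector[:n]]
    return " AND ".join([query] + top_terms)
```

Because the result is a plain conjunctive query, it can be sent to any standard search engine, which is exactly the "Pros" point above; the "Cons" follows because every added term further restricts the result set.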
Algorithm 2: Rank-Biasing
• Requires a modified search engine that supports a RANK operator for rank-biasing
• A complex query comprises:
  – A selection part
  – Optional ranking terms that only impact the score of the selected documents
• Input parameters:
  – Query, context term vector
  – Number of selection terms to consider (conjunctive semantics)
  – Number of RANK operators
  – Weight multiplier for each RANK operator (used for scaling)
• Experimental setup:
  – RB2 (uses 1 selection term, 2 RANK operators, weight multiplier = 0.1)
  – RB6 (uses 2 selection terms, 6 RANK operators, weight multiplier = 0.01)
• Example:
  – RB2: Given q and c = ((a, 50), (b, 25), (c, 12)) => q AND a RANK(b, 2.5) RANK(c, 1.2)
• Pros:
  – Ranking terms do not limit recall
• Cons:
  – Requires a modified search engine back-end; more expensive to evaluate
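The query construction for RBn can be sketched as follows. This only builds the complex query string in the RANK syntax shown in the example above; evaluating it still requires the modified back-end, and the function name is illustrative.

```python
def rank_bias_query(query, context_vector, n_select, n_rank, multiplier):
    """Build a complex rank-biasing query: the top n_select context terms
    join the conjunctive selection part; the next n_rank terms become
    RANK operators that boost scores without filtering results."""
    selection = [term for term, _w in context_vector[:n_select]]
    ranking = context_vector[n_select:n_select + n_rank]
    q = " AND ".join([query] + selection)
    for term, weight in ranking:
        # Scale the raw context weight by the multiplier, as in RB2/RB6.
        q += f" RANK({term}, {weight * multiplier:g})"
    return q
```

Reproducing the slide's RB2 example: selection term a, then RANK operators for b and c with weights 25 × 0.1 = 2.5 and 12 × 0.1 = 1.2.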
Algorithm 3: IFM
• IFM is based on the concept of meta-search (e.g., as used in the Buying Guide Finder [Kraft, Stata, 2003])
  – Sends multiple (simple) queries to possibly multiple search engines
  – Combines results using rank aggregation methodologies
IFM Query Generation
• Uses a "query templates" approach:
  – Query templates specify how sub-queries are constructed from the pool of candidate terms
  – They allow exploring the problem domain in a systematic way
  – Implemented primarily as a sliding-window technique using query templates
  – Example: Given query q and c = (a, b, c, d), a sliding-window query template of size 2 may construct the following queries:
    • q a b
    • q b c
    • q c d
• Parameters:
  – Size of the sliding window
• Experimental setup:
  – IFM-SW1, IFM-SW2, IFM-SW3, IFM-SW4
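The sliding-window template above is straightforward to sketch. A minimal version (function name illustrative), taking the context terms in order:

```python
def sliding_window_queries(query, context_terms, window):
    """IFM-SWn: build one simple sub-query per window of `window`
    consecutive context terms appended to the original query."""
    return [
        f"{query} " + " ".join(context_terms[i:i + window])
        for i in range(len(context_terms) - window + 1)
    ]
```

With q and (a, b, c, d) and window size 2 this reproduces the three sub-queries on the slide; each is a simple query, so any standard engine can serve as the back-end.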
IFM uses Rank Aggregation for combining different result sets
• Rank aggregation is a robust and principled approach for combining several ranked lists into a single ranked list
• Given a universe U and k ranked lists π1, …, πk on the elements of the universe:
  – Combine the k lists into π* such that Σ_{i=1..k} d(π*, πi) is minimized
  – For d(·, ·) we used various distance functions (e.g., Spearman footrule, Kendall tau)
• Parameters:
  – Style of rank aggregation:
    • Rank averaging (an adaptation of the Borda voting method)
    • MC4 (based on Markov chains; more computationally expensive)
• Experimental setup:
  – IFM-RA, IFM-MC4
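Of the two aggregation styles, rank averaging is simple enough to sketch. This is a generic Borda-style sketch, not the Y!Q implementation; how missing items are penalized is an assumption (here they are placed at the bottom of each list they are absent from).

```python
from collections import defaultdict

def rank_average(ranked_lists):
    """Borda-style rank averaging: score each item by the sum of its
    positions across all lists (lower is better), then sort by score.
    Items missing from a list are treated as ranked at its bottom."""
    scores = defaultdict(float)
    universe = {item for lst in ranked_lists for item in lst}
    for lst in ranked_lists:
        pos = {item: i for i, item in enumerate(lst)}
        bottom = len(lst)  # assumed penalty position for unseen items
        for item in universe:
            scores[item] += pos.get(item, bottom)
    return sorted(universe, key=lambda item: scores[item])
```

MC4 instead builds a Markov chain whose states are the items and ranks them by stationary probability, which is why the slide notes it is more computationally expensive.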
Experimental Setup and Methodology
• Benchmark:
  – 200 contexts sampled from Y!Q query logs
• Tested 41 configurations:
  – 15 QR (Yahoo, MSN, Google)
  – 18 RB (1 or 2 selection terms; 2, 4, or 6 RANK operators; 0.01, 0.1, or 0.5 weight multipliers)
  – 8 IFM (rank averaging and MC4 on Yahoo, SW1 to SW4)
• Per-item test:
  – Perceived relevancy of each result to the context
  – Relevancy judgments:
    • Yes
    • Somewhat
    • No
    • Can't Tell
  – 28 expert judges looked at the top 3 results, for a total of 24,556 judgments
Example
• Context:
  – "Cowboys Cut Carter; Testaverde to Start. OXNARD, Calif. Quincy Carter was cut by the Dallas Cowboys on Wednesday, leaving 40-year-old Vinny Testaverde as the starting quarterback. The team wouldn't say why it released Carter."
• Judgment examples:
  – A result directly relating to the Dallas Cowboys (football team) or Quincy Carter => Yes
  – A result repeating the same or similar information => Somewhat
  – A result about Jimmy Carter, the former U.S. president => No
  – A result that doesn't provide sufficient information => Can't Tell
Metrics
• Strong Precision at 1 (SP@1) and 3 (SP@3)
  – The fraction of the top 1 (or top 3) retrieved results that are relevant
  – A result is considered relevant if and only if it received a 'Yes' judgment
• Precision at 1 (P@1) and 3 (P@3)
  – The fraction of the top 1 (or top 3) retrieved results that are relevant
  – A result is considered relevant if and only if it received a 'Yes' or 'Somewhat' judgment
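Both metric families reduce to one computation over the per-result judgment labels; a minimal sketch (label encoding assumed):

```python
def precision_at_k(judgments, k, strong=False):
    """P@k / SP@k over per-result judgments encoded as
    'Y' (Yes), 'S' (Somewhat), 'N' (No), or '?' (Can't Tell).
    Strong precision counts only 'Y' as relevant; regular
    precision also counts 'S'."""
    top = judgments[:k]
    relevant = {"Y"} if strong else {"Y", "S"}
    return sum(1 for j in top if j in relevant) / len(top)
```

For example, a result list judged Yes, Somewhat, No gives P@3 = 2/3 but SP@3 = 1/3, which is why SP@k is the stricter of the two measures.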