Optimal Aggregation Policy for Web Search Jeong-Min Yun 1 , Yuxiong - PowerPoint PPT Presentation

Optimal Aggregation Policy for Web Search
Jeong-Min Yun (1), Yuxiong He (2), Sameh Elnikety (2), Shaolei Ren (3)
(1) POSTECH, (2) Microsoft Research, (3) Florida International University

Web Search Architecture
• Billions of web documents are partitioned among many servers
• Distributed system with aggregators and index serving nodes (ISNs)
[Figure: aggregator tree. A top-level aggregator (TLA) fans out to mid-level aggregators (MLAs), each MLA fans out to ISNs, and each ISN serves one partition of the web documents.]

Aggregation Policy
• Decide how long aggregators wait for ISNs
• Latency: tail latency for consistently fast responses
• Quality: fraction of ISNs whose results are returned
• Latency-quality tradeoff
  • No-waiting policy gives zero latency but zero quality
  • Wait-all policy gives perfect quality but maximum latency
• Our objective: reduce tail latency while meeting quality requirements
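The two metrics above can be made concrete with a small sketch. It assumes a query is represented as a list of per-ISN response times and that the aggregator replies at a wait limit or when the last response arrives, whichever comes first. The function names and the nearest-rank percentile convention are illustrative choices, not taken from the slides.

```python
import math

def quality(response_times, wait):
    """Fraction of ISNs whose responses arrive within the wait limit."""
    return sum(1 for t in response_times if t <= wait) / len(response_times)

def latency(response_times, wait):
    """Reply time: the wait limit or the last response, whichever is earlier."""
    return min(wait, max(response_times))

def tail_latency(latencies, percentile=95):
    """Nearest-rank percentile, computed with integer arithmetic."""
    ordered = sorted(latencies)
    rank = max(1, -(-percentile * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]

# One hypothetical query with three quick ISNs and one straggler (times in ms).
query = [10, 12, 15, 90]
no_wait = (quality(query, 0), latency(query, 0))
wait_all = (quality(query, math.inf), latency(query, math.inf))
```

Here `no_wait` shows the zero-latency, zero-quality extreme and `wait_all` shows the full-quality, maximum-latency extreme described in the tradeoff bullets.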

Challenges
• Online decision
  • Aggregators do not know when ISNs will return their results
• Different queries exhibit highly variable service demand
• ISN response times vary significantly even for a single query

Prior Work
• Wait for all
• Wait by time t
• Wait until quality q
• Jointly consider time and quality
• Open question: which query should be terminated?
• Limitations
  • Heuristic algorithms that miss potential latency improvement
  • None of them can address multilevel aggregation
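The first three baselines can be sketched as functions that return the time at which the aggregator replies. This is an offline replay that assumes all response times of a query are known, which a live aggregator does not have; the function names are hypothetical. The joint time-and-quality policy is omitted because the slides do not describe its rule.

```python
import math

def wait_all(response_times):
    """Reply only after every ISN has responded."""
    return max(response_times)

def wait_by_time(response_times, t):
    """Reply at time t, or earlier if all ISNs have already responded."""
    return min(t, max(response_times))

def wait_until_quality(response_times, q):
    """Reply as soon as at least a fraction q of the ISNs have responded."""
    ordered = sorted(response_times)
    # The small epsilon guards against floating-point overshoot in q * n.
    needed = max(1, math.ceil(q * len(ordered) - 1e-9))
    return ordered[needed - 1]

# One hypothetical query with a single straggling ISN (times in ms).
query = [10, 12, 15, 90]
reply_times = (
    wait_all(query),
    wait_by_time(query, 20),
    wait_until_quality(query, 0.75),
)
```

Each baseline applies the same rule to every query, which is why none of them answers the question of which queries should be terminated.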

Summary of Contributions
• Workload characterization and key intuitions
• FSL: a new aggregation policy with optimality proof
  • Performs as well as the optimal policy
• Extension to multilevel aggregation
• Experimental evaluation
  • Microsoft Bing search and Advertisement production traces
  • Reduces tail latency by 36% over the best prior work

Intuitions
• Workload characterization: three types of queries
  • Fast query: responses from all ISNs arrive quickly
  • Straggling query: most responses arrive quickly, with a few stragglers
  • Long query: most responses take a long time
• Key intuition
  • Complete fast and long queries for quality
  • Terminate straggling queries to reduce latency

Intuitions by Example
• Goal: minimize 95th-percentile latency with average quality ≥ 0.99
• Fast query: its completion time does not affect the 95th-percentile tail latency
• Straggling query:
  • Miss at most 1 – 0.99 = 1% of ISN responses
  • Allocate the 1% quality loss to straggling queries to maximize latency reduction
• Long query: fewer than 5% of queries can be long queries that respond slowly with full quality without affecting the 95th-percentile tail latency
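The 1% figure follows directly from the average-quality constraint. A short derivation, where u_i for the quality of query i and n for the number of queries are notation assumed here rather than taken from the slides:

```latex
\frac{1}{n}\sum_{i=1}^{n} u_i \ge 0.99
\quad\Longleftrightarrow\quad
\sum_{i=1}^{n} \left(1 - u_i\right) \le 0.01\,n
```

The total quality loss summed over all queries is therefore bounded by 0.01 n. Fast and long queries that run to completion contribute zero loss, so the whole budget can be spent on straggling queries.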

FSL Aggregation Algorithm - for Fast, Straggling, Long queries
• Single time threshold and quality threshold
  • Differentiate fast, straggling and long queries with proper actions
• Data-driven approach
  • Offline processing: find the best time and quality thresholds using data traces
  • Online processing: terminate a query at the time threshold if it is still incomplete and its quality has reached the quality threshold
• Optimality proof: FSL performs as well as the offline optimal policy

FSL: Key Idea
• There exists a simple policy with one time threshold and one quality threshold whose tail latency is equivalent to that of any optimal policy
• Example: for 100 queries, let the termination time of the i-th query (q_i) under an optimal policy be t_i, with t_1 ≤ t_2 ≤ … ≤ t_100. Then there exists a latency- and quality-equivalent simple policy:

  Query    Optimal policy    Simple policy
  q_1      t_1               t_95
  …        …                 …
  q_94     t_94              t_95
  q_95     t_95              t_95     (same 95th-percentile tail latency)
  q_96     t_96              ∞
  …        …                 …
  q_100    t_100             ∞
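The equivalence in this example can be checked numerically. The sketch below uses made-up termination times and a nearest-rank percentile, both of which are assumptions for illustration. It confirms that the simple policy keeps the same 95th-percentile latency and never terminates a query earlier than the optimal policy does, so no query can lose responses.

```python
import math

def nearest_rank(values, percentile):
    """Nearest-rank percentile, computed with integer arithmetic."""
    ordered = sorted(values)
    rank = max(1, -(-percentile * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]

# Hypothetical termination times t_1 <= ... <= t_100 from some optimal policy.
optimal = [10 * i for i in range(1, 101)]
t95 = nearest_rank(optimal, 95)

# Simple policy: the first 95 queries stop at t_95, the last 5 run to completion.
simple = [t95] * 95 + [math.inf] * 5

same_tail = nearest_rank(simple, 95) == t95
never_earlier = all(s >= o for s, o in zip(simple, optimal))
```

Because waiting longer can only add responses, `never_earlier` implies that the quality of every query under the simple policy is at least its quality under the optimal policy.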

FSL: Online Processing
• Time threshold t* and quality threshold u*
• At time t*:
  • If all responses have been returned: do nothing (fast query)
  • If quality u ≥ u*: terminate the query (straggling query)
  • If quality u < u*: run the query until completion (long query)
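The three cases above can be sketched as a single decision function. This is a minimal offline replay that assumes all per-ISN response times of a query are available as a list; a live aggregator would only see the responses received by t*. The function name and return shape are hypothetical.

```python
def fsl_online(response_times, t_star, u_star):
    """Return (label, termination_time, quality) for one query under FSL."""
    finish = max(response_times)
    if finish <= t_star:
        # All responses arrived by t*: nothing to decide.
        return "fast", finish, 1.0
    # Quality at t*: fraction of ISNs that have responded so far.
    u = sum(1 for t in response_times if t <= t_star) / len(response_times)
    if u >= u_star:
        # Most responses are in: cut the stragglers.
        return "straggling", t_star, u
    # Too few responses to cut: run to completion for full quality.
    return "long", finish, 1.0

# Hypothetical queries (times in ms) with t* = 20 and u* = 0.7.
fast = fsl_online([10, 12, 15, 18], 20, 0.7)
straggling = fsl_online([10, 12, 15, 90], 20, 0.7)
long_query = fsl_online([10, 80, 85, 90], 20, 0.7)
```

The decision at t* is a single comparison of u against u*, which matches the constant per-query online cost.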

FSL: Offline Processing
• How to compute time threshold t* and quality threshold u*?
• For each candidate time threshold: ① assign quality 1 to long queries; ② check whether it satisfies all quality requirements
• The time threshold t* is the smallest candidate that satisfies all quality requirements
• The quality threshold u* is the lowest quality among straggling queries at that time
• Time complexity: O((rn + n log n)(t_max/δ)), where n is the number of queries, r the number of ISNs, t_max the maximum response time, and δ the time step size
• Any given workload requires offline processing only once; the online decision for a query is a simple comparison with constant cost
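One way to read the offline steps is sketched below. It is an interpretation under stated assumptions, not the authors' implementation: the only quality requirement is a minimum average quality, the tail is a nearest-rank percentile, and the queries allowed to finish after the candidate threshold are the unfinished ones with the lowest quality at that time. All names and the example trace are hypothetical.

```python
def fsl_offline(traces, target_quality, percentile, step, t_max):
    """Scan candidate thresholds and return (t_star, u_star), or None."""
    n = len(traces)
    rank = max(1, -(-percentile * n // 100))  # nearest-rank tail position
    long_budget = n - rank                    # queries allowed to finish after t
    t = step
    while t <= t_max:
        # Quality at time t of each query that has not finished by t, ascending.
        unfinished = sorted(
            sum(1 for x in q if x <= t) / len(q)
            for q in traces if max(q) > t
        )
        if len(unfinished) <= long_budget:
            u_star, lost = 1.0, 0.0           # every unfinished query may complete
        else:
            # Queries strictly below u_star are long and keep quality 1.
            u_star = unfinished[long_budget]
            lost = sum(1.0 - u for u in unfinished if u >= u_star)
        if 1.0 - lost / n >= target_quality:
            return t, u_star                  # smallest feasible candidate
        t += step
    return None

# Hypothetical trace: 17 fast, 2 straggling and 1 long query over 4 ISNs (ms).
traces = [[5, 6, 7, 8]] * 17 + [[5, 6, 7, 100]] * 2 + [[80, 90, 90, 100]]
loose = fsl_offline(traces, 0.97, 95, 10, 100)
strict = fsl_offline(traces, 0.99, 95, 10, 100)
```

Each candidate costs one pass over all responses plus one sort of the queries, in line with the (rn + n log n) factor in the stated complexity. With the looser target the stragglers are cut at the first candidate; with the stricter target no cut fits the quality budget, so the threshold moves out to the maximum response time.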

Extension to Multilevel Aggregation
• New challenges
  • Aggregators' decisions on different levels are coupled
  • Communication between different levels of aggregators is essential to check query progress, but the amount of communication must be small
[Figure: a TLA above several MLAs, each above several ISNs. Callouts: the TLA doesn't know the quality of the current query unless all MLAs send their progress; for an MLA to know the quality, the TLA should send the computed value back to the MLA.]

FSL for Two-Level Aggregation
• Known messaging times
  • Almost the same as the single-aggregator case (an optimality proof is still possible)
• Bounded messaging times
  • An approximation error bound is derived
• Unknown messaging times
  • The proposed heuristic (no optimality guarantee) forces all MLAs to send their partial results at the same time point

Experimental Setup
• Workload
  • Single aggregator: Microsoft Bing production traces
  • Two-level aggregation: Microsoft Bing Ads production traces
  • Rich set of synthetic workloads
• Algorithms in comparison
  • Wait all: wait for responses from all ISNs
  • Time only: return results at time t
  • Quality only: return results at quality q
  • Kwiken [1]: jointly consider time and quality thresholds

[1] V. Jalaparti, P. Bodik, S. Kandula, I. Menache, M. Rybalkin, and C. Yan. Speeding up distributed request-response workflows. In SIGCOMM '13, 2013.

Experiments: Single Aggregator
• Microsoft Bing search engine production traces
• Latency of 44 ISNs over 66,922 queries (10,000 for training, 56,922 for test)
• Goal: minimize 95th-percentile tail latency while keeping average quality ≥ 0.99
• FSL reduces tail latency by 53% over wait-all and by 36% over the best alternative

Experiments: Multilevel Aggregation
• Microsoft Advertisement engine production traces
• 1 TLA, 16 MLAs, 64 ISNs (4 per MLA); 10,000 for training, 6,311 for test
• Goal: minimize 95th-percentile tail latency while keeping average quality ≥ 0.99
• FSL-U is within 12% of the optimal (FSL-K)
• FSL-U reduces tail latency by 15% over the best alternative

Conclusion
• FSL: optimal online aggregation policy
• Extension to multilevel aggregation
  • Optimal for known messaging time between aggregators
  • Empirically effective policy for unknown messaging time
• Experimental evaluation
  • Microsoft Bing search and Advertisement production traces
  • Reduces tail latency by 36% over the best prior work
