Expediting Search Trend Detection via Prediction of Query Counts
Nadav Golbandi (Yahoo! Labs, Haifa, Israel), Liran Katzir* (Microsoft Research, ATL-Israel), Yehuda Koren* (Google, Haifa, Israel), Ronny Lempel (Yahoo! Labs, Haifa, Israel)
International Conference on Web Search and Data Mining, February 6th - 8th, 2013, Rome, Italy
Search Trends: search queries with a sudden popularity surge
[Figure: two panels, "Amanda Peet" and "Hybrid Cars"; y-axis: #queries (0 to 1); x-axis: time in hours (0 to 60)]
Trending Topics: Microsoft MSN biggest movers, Yahoo! Trending Now, Google Trends
Prior-art Detection Algorithm
1. Create a language model $M_t$ at time $t$.
2. Estimate the trendiness score $\mathrm{score}(q, t) = \log P(q \mid M_t) - \max_i \log P(q \mid M_{t-i})$. The subtracted max term lowers the score of periodic queries.
3. Output the top-k queries.
A. Dong, Y. Chang, Z. Zheng, G. Mishne, J. Bai, R. Zhang, K. Buchner, C. Liao, and F. Diaz. Towards recency ranking in web search. In WSDM, pages 11-20, 2010.
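The three steps can be sketched in a few lines. This is only an illustration under stated assumptions: each language model is taken to be a table of query counts, $P(q \mid M)$ is estimated with add-one smoothing, and the reference models are two earlier count tables. The function names, the smoothing, and the toy counts are hypothetical and are not taken from Dong et al.

```python
import math
from collections import Counter

def log_prob(query, counts, vocab_size):
    """Add-one smoothed log P(query | model); the smoothing is an assumption."""
    total = sum(counts.values())
    return math.log((counts.get(query, 0) + 1) / (total + vocab_size))

def trend_score(query, current, references, vocab_size):
    """log P(q | current model) minus the largest log P(q | reference model)."""
    best_ref = max(log_prob(query, ref, vocab_size) for ref in references)
    return log_prob(query, current, vocab_size) - best_ref

def top_k(current, references, k):
    """Score every query seen in the current window and return the k best."""
    vocab = set(current)
    for ref in references:
        vocab |= set(ref)
    size = len(vocab)
    scored = [(trend_score(q, current, references, size), q) for q in current]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [q for _, q in scored[:k]]

# Toy count tables (invented for illustration).
current = Counter({"amanda peet": 50, "weather": 100, "news": 100})
yesterday = Counter({"amanda peet": 2, "weather": 100, "news": 148})
last_week = Counter({"amanda peet": 1, "weather": 150, "news": 99})

trending = top_k(current, [yesterday, last_week], k=1)
```

Taking the maximum over the reference tables means a query that was just as frequent in any earlier window gets a score near or below zero, which is how recurring queries are kept out of the top-k list in this sketch.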
Proposed Method
Prior-art algorithm: query counts during [t-1, t) -> Prior-art Algorithm -> trending searches for time t
New algorithm: query counts during [t-1, t) -> Prediction Procedure -> predicted counts for [t, t+1) -> Prior-art Algorithm -> trending searches for time t+1
Goal: keep precision and recall, reduce detection time!
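The wiring of the new algorithm can be sketched as a thin wrapper: predict the counts for the next interval, then hand them to the unchanged detector. Everything named here is hypothetical. `naive_predict` and `top_by_count` are toy stand-ins for the prediction procedure and the prior-art algorithm, chosen only so the sketch runs.

```python
def expedited_trends(histories, predict_next, detect, k):
    """Apply an existing detector to predicted counts for the next interval.

    histories:    {query: list of recent per-minute counts}
    predict_next: callable mapping a history to a predicted count for [t, t+1)
    detect:       callable (counts, k) -> list of queries (the prior-art step)
    """
    predicted = {q: predict_next(h) for q, h in histories.items()}
    return detect(predicted, k)

def naive_predict(history):
    # Toy stand-in: extrapolate the last minute's count over 60 minutes.
    return 60 * history[-1]

def top_by_count(counts, k):
    # Toy stand-in for the detector: rank queries by count, ties by name.
    return sorted(counts, key=lambda q: (-counts[q], q))[:k]

histories = {"hybrid cars": [1, 2, 9], "weather": [5, 5, 5]}
result = expedited_trends(histories, naive_predict, top_by_count, k=1)
```

Because the detector is passed in as a callable, it needs no changes; only its input moves from observed counts to predicted ones.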
Intuition: counts time series. For a query $q$, let $c_i(q)$ be the number of times $q$ was issued during minute $i$ (1-minute bins: $c_1(q), c_2(q), \ldots, c_{30}(q), \ldots$).
Intuition: prediction input. The per-minute counts observed up to minute 30 form the input $\mathbf{x}(q, 30)$.
Intuition: prediction target. What will the query volume be during the following hour (minutes 31, 32, ..., 90)?
Intuition: prediction target value. $y(q, 30) = c_{31}(q) + c_{32}(q) + \cdots + c_{90}(q)$.
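As a concrete illustration of the input and the target value, the sketch below splits a list of per-minute counts into the counts up to minute t and the total over the next hour. The function name `make_example`, the `horizon` parameter, and the toy data are assumptions made for this example.

```python
def make_example(minute_counts, t, horizon=60):
    """Return (x, y) for one query.

    minute_counts[0] is the count for minute 1.
    x holds the counts for minutes 1..t.
    y is the total count for minutes t+1..t+horizon.
    """
    if t + horizon > len(minute_counts):
        raise ValueError("not enough minutes to compute the target")
    x = minute_counts[:t]
    y = sum(minute_counts[t:t + horizon])
    return x, y

# Toy data: minute i has count i.
counts = list(range(1, 121))
x, y = make_example(counts, 30)
```

With t = 30, `x` covers minutes 1 to 30 and `y` adds up minutes 31 to 90, matching the target value on the slide.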
Explanatory and Response variables
Explanatory variables $\mathbf{x}(q, t)$: the per-minute counts of $q$ observed up to minute $t$. Response $y(q, t)$: the total count of $q$ over the following hour.
Training phase (only on trends): the pairs $(\mathbf{x}(q, 30), y(q, 30))$, $(\mathbf{x}(q, 31), y(q, 31))$, $(\mathbf{x}(q, 32), y(q, 32))$, ..., $(\mathbf{x}(q, t), y(q, t))$.
Runtime phase (on all queries): for a new input $\mathbf{x}(q, t')$ the response is unknown, and the fitted vector $\mathbf{w}$ gives the prediction $\hat{y}(q, t') = \mathbf{w}^{\top}\mathbf{x}(q, t')$.
Prediction Models: auto-regressive
Time-smoothed, single model: this method fits a vector $\mathbf{w}$ such that $\mathbf{w}^{\top}\mathbf{x}(q, t) \approx y(q, t)$.
Time-smoothed, per-hour model: this method fits a vector $\mathbf{w}(h)$ for every hour $h$ such that $\mathbf{w}(h)^{\top}\mathbf{x}(q, t) \approx y(q, t)$ when $t$ falls in hour $h$.
[Figure: % of good predictions (60 to 72) vs. length of $\mathbf{x}(q, t)$ (0 to 90), with one curve for the Per-Hour Model and one for the Single Model]
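To make the difference between the two models concrete, here is a small pure-Python sketch that fits a weight vector by ordinary least squares, once over all examples and once per hour. Ordinary least squares is an assumption; the slides do not state the fitting procedure or how the time smoothing is done. All function names and the toy weights are hypothetical.

```python
def solve(a, b):
    """Solve a @ w = b by Gauss-Jordan elimination with partial pivoting.

    Assumes a is square and non-singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(xs, ys):
    """Least-squares w minimizing sum((w . x - y)^2) via the normal equations."""
    d = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(d)] for i in range(d)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    return solve(xtx, xty)

def fit_per_hour(examples):
    """examples: iterable of (hour, x, y); returns {hour: w}."""
    by_hour = {}
    for hour, x, y in examples:
        xs, ys = by_hour.setdefault(hour, ([], []))
        xs.append(x)
        ys.append(y)
    return {h: fit(xs, ys) for h, (xs, ys) in by_hour.items()}

def predict(w, x):
    """Linear prediction w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Toy data: two hours with different true weights, window length 2.
true_w = {18: [2.0, 10.0], 3: [1.0, 4.0]}
windows = [[1, 0], [0, 1], [1, 1], [2, 1]]
examples = [(h, x, predict(w, x)) for h, w in true_w.items() for x in windows]

models = fit_per_hour(examples)
single = fit([x for _, x, _ in examples], [y for _, _, y in examples])
```

On this toy data the per-hour fit recovers each hour's weights, while the single model settles on a compromise between them. That is the kind of gap the per-hour model is meant to close when query behavior depends on the time of day.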
Example of a specific hour model: $\mathbf{w}$ for 18:00-19:00
[Figure: coefficient value (0 to 10) vs. index (-30 to 0)]
Experimental Precision and Recall
[Figure: two panels, Recall and Precision; y-axis: %improvement (0.95 to 1.05); x-axis: number of search trends (0 to 20)]
Experimental detection time
[Figure: average reduction in detection time (0 to 30) vs. number of search trends (0 to 20)]
An alternative perspective
Building a language model in a temporal corpus involves a trade-off. Taking a long history creates bias. Taking a short history creates variance. The new method can be viewed as a time-sensitive language model.
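The trade-off can be seen on a toy count series. The sketch below estimates a query's current rate as the mean of a trailing window, once with a long window and once with a short one. The series, the window lengths, and the helper names are all invented for illustration and are not part of the method.

```python
def window_mean(series, end, length):
    """Mean of the `length` values just before index `end`."""
    window = series[end - length:end]
    return sum(window) / len(window)

def spread(series, length, start, stop):
    """Range of the windowed estimates for window ends start..stop."""
    estimates = [window_mean(series, e, length) for e in range(start, stop + 1)]
    return max(estimates) - min(estimates)

flat = [8, 12] * 30           # noisy but stable period, true mean 10
series = flat + [100] * 5     # sudden surge to 100 per minute
LONG, SHORT = 20, 3

# Stable period: how much do the estimates jump around?
long_spread = spread(flat, LONG, LONG, len(flat))
short_spread = spread(flat, SHORT, LONG, len(flat))

# Right after the surge: how close is each estimate to 100?
long_after = window_mean(series, len(series), LONG)
short_after = window_mean(series, len(series), SHORT)
```

During the stable period the long window gives a steady estimate while the short one fluctuates (variance). After the surge the short window tracks the new level while the long one is still dragged down by old data (bias).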
Conclusion/Discussion
• We presented a new scheme for building a language model, for use as a building block in temporally dynamic environments.
• The new scheme largely maintains system performance (precision and recall) while reducing detection time.
Thanks