
Search Evaluation at Grooveshark, Yoni Teitelbaum, 2013-07-02



  1. Search Evaluation at Grooveshark Yoni Teitelbaum 2013-07-02

  2. Traditional Evaluation: TREC (image courtesy of TREC, http://trec.nist.gov)

  3. Disadvantages of TREC-Style Evaluation Methods
     1. Expensive:
        a. e.g., 2005 GOV2 collection
           i. > 45k judgments [2]
           ii. > 25 million documents [3]
     2. Mostly news articles
        a. significantly different data set than GS songs database

  4. GS Weaknesses: Small Team, Few Resources

  5. GS Strengths: We’ve got a huge audience!

  6. A/B Testing Using Click Data
     A Group Sees:   B Group Sees:
     Song 1          Song 2
     Song 2          Song 3
     Song 3          Song 1
     Song 4          Song 4

  7. What to Measure?
     ● Average Rank of Click?
     ● Bounce Rate (% of Searches Without a Click)
     ● Average Amount of Time Spent on Search Page?
     ● Median Rank of Click?
     ● ...?
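To make the candidate metrics concrete, here is a minimal Python sketch of how three of them (average click rank, median click rank, bounce rate) could be computed from a click log. The input shape and the function name `click_metrics` are assumptions for illustration; the deck does not show how Grooveshark stored or processed its click data.

```python
from statistics import mean, median


def click_metrics(searches):
    """Compute simple click metrics for one test group.

    `searches` is a hypothetical log format: one list per search, holding
    the 1-based ranks of the results the user clicked (empty = no click).
    """
    # Flatten every clicked rank across all searches.
    clicked_ranks = [rank for clicks in searches for rank in clicks]
    # A "bounce" is a search that ended without any click.
    bounces = sum(1 for clicks in searches if not clicks)
    return {
        "average_click_rank": mean(clicked_ranks) if clicked_ranks else None,
        "median_click_rank": median(clicked_ranks) if clicked_ranks else None,
        "bounce_rate": bounces / len(searches) if searches else None,
    }


if __name__ == "__main__":
    # Four searches: clicks at rank 1, no click, clicks at ranks 2 and 4, click at rank 1.
    print(click_metrics([[1], [], [2, 4], [1]]))
```

Running the same function over the A group's log and the B group's log gives one number per metric per group, which is exactly where the next slide's question arises: the metrics need not agree on which ranking is better.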

  8. So Which One's Better?

  9. "Gold Standard" Algorithms [4]: Song 7 Song 2 Song 3 Song 5 Song 4 Song 6 Song 1 Song 8

  10. Low Power on Conventional Metrics (image courtesy of Radlinski, Kurup, and Joachims, 2008)

  11. Low Power Cont'd (image courtesy of Radlinski, Kurup, and Joachims, 2008)

  12. Interleaving Method [5]
      Algorithm A   Algorithm B
      Song 1A       Song 1B
      Song 2A       Song 2B
      Song 3A       Song 3B

  13. Interleaving Method: User Sees...
      Song 1A
      Song 1B
      Song 2A
      Song 2B
      Song 3A
      Song 3B
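The merge shown on the two interleaving slides can be sketched in Python as follows. This is a simplified variant in the spirit of the balanced interleaving described in reference 5, not necessarily the exact procedure Grooveshark ran: it alternates between the two rankings, skips songs already shown, and keeps going until both lists are used up. The function name `interleave` and the `a_first` parameter are assumptions for illustration.

```python
import random


def interleave(ranking_a, ranking_b, a_first=None):
    """Merge two rankings into one list, alternating between them.

    Simplified sketch: a coin flip decides which ranking leads, and a song
    that appears in both rankings is shown only once.
    """
    if a_first is None:
        a_first = random.random() < 0.5  # randomize which algorithm leads
    merged, seen = [], set()
    ka = kb = 0  # next unread position in each ranking
    while ka < len(ranking_a) or kb < len(ranking_b):
        # Take from A when B is exhausted, or when A has contributed fewer
        # positions so far (ties broken by the coin flip).
        take_a = kb >= len(ranking_b) or (
            ka < len(ranking_a) and (ka < kb or (ka == kb and a_first))
        )
        if take_a:
            item = ranking_a[ka]
            ka += 1
        else:
            item = ranking_b[kb]
            kb += 1
        if item not in seen:
            seen.add(item)
            merged.append(item)
    return merged


if __name__ == "__main__":
    a = ["Song 1A", "Song 2A", "Song 3A"]
    b = ["Song 1B", "Song 2B", "Song 3B"]
    print(interleave(a, b, a_first=True))
```

With `a_first=True` and the slide's two lists, the result is the order the user sees on this slide. Randomizing the leader per search keeps either algorithm from always holding the top position.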

  14. R Script to Process Results

  15. Results From Interleaving Test

  16. The Whole Stack
      HTML client (JavaScript)
      Server (PHP)
      Hive / Hadoop (SQL)
      Binomial Test (R script)
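The final stage of the stack is a binomial test. The deck's R script is not reproduced in this transcript, so the following is only a Python sketch of what such a test might look like: an exact two-sided binomial (sign) test of the null hypothesis that each decided search is equally likely to favor either algorithm. The function name and the idea of counting one "win" per search (with ties dropped beforehand) are assumptions.

```python
from math import comb


def binomial_two_sided_p(wins_a, wins_b):
    """Exact two-sided binomial test against p = 0.5.

    `wins_a` / `wins_b` are hypothetical counts of searches whose clicks
    favored algorithm A or B; tied searches are assumed to be excluded.
    """
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # no evidence either way
    k = max(wins_a, wins_b)
    # P(X >= k) for X ~ Binomial(n, 0.5); the distribution is symmetric,
    # so the two-sided p-value is twice the upper tail, capped at 1.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


if __name__ == "__main__":
    print(binomial_two_sided_p(8, 2))
```

A small p-value suggests the preference for one algorithm is unlikely to be chance. In R, the analogous call would be `binom.test` with the two win counts.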

  17. References
      1. Text Retrieval Conference. http://trec.nist.gov/
      2. TREC list of judgments for 2005 ad hoc query track. http://trec.nist.gov/data/terabyte/05/05.adhoc_qrels
      3. University of Glasgow, Information Retrieval Group. http://ir.dcs.gla.ac.uk/test_collections/gov2-summary.htm
      4. F. Radlinski, M. Kurup, and T. Joachims. How does clickthrough data reflect retrieval quality? In Conference on Information and Knowledge Management (CIKM), 2008.
      5. T. Joachims. Evaluating retrieval performance using clickthrough data. In J. Franke, G. Nakhaeizadeh, and I. Renz, editors, Text Mining, pages 79-96. Physica/Springer Verlag, 2003.
