Reduce and Aggregate: Similarity Ranking in Multi-Categorical Bipartite Graphs


  1. Reduce and Aggregate: Similarity Ranking in Multi-Categorical Bipartite Graphs Alessandro Epasto J. Feldman*, S. Lattanzi*, S. Leonardi°, V. Mirrokni*. *Google Research °Sapienza U. Rome

  2. Motivation ● Recommendation Systems: ● Bipartite graphs with Users and Items. ● Identify similar users and suggest relevant items. ● Concrete example: The AdWords case. ● Two key observations: ● Items belong to different categories. ● Graphs are often lopsided.

  3. Modeling the Data as a Bipartite Graph [Diagram: bipartite graph linking millions of advertisers to billions of queries (e.g., Nike Store, Soccer Shoes, Soccer Ball); edges carry bid amounts (1$-5$), and queries are tagged with labels from hundreds of categories (e.g., New York, Apparel, Sport, Equipment).]

  4. Personalized PageRank For a node v (the seed) and a teleport probability alpha: the walk follows a random out-edge with probability 1 - alpha and jumps back to the seed v with probability alpha. The stationary distribution assigns a similarity score to each node in the graph w.r.t. node v.
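A reconstruction of the formula this slide most likely showed (the standard Personalized PageRank fixed-point equation; the notation P, e_v is mine, not copied from the slide):

```latex
% Personalized PageRank with seed v and teleport probability \alpha:
% P is the row-stochastic transition matrix of the graph and e_v the
% indicator (row) vector of the seed v.
\[
  \pi_v = \alpha\, e_v + (1-\alpha)\,\pi_v P
  \quad\Longleftrightarrow\quad
  \pi_v = \alpha\, e_v \bigl(I-(1-\alpha)P\bigr)^{-1}
\]
```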

  5. The Problem [Same advertiser-query diagram as slide 3.]

  6. Other Applications ● General approach applicable to several contexts: ● Users, Movies, Genres: find similar users and suggest movies. ● Authors, Papers, Conferences: find related authors and suggest papers to read.

  7. Semi-Formal Problem Definition [Diagram: advertisers on one side, queries on the other.]

  8. Semi-Formal Problem Definition [Diagram: a seed advertiser A is highlighted.]

  9. Semi-Formal Problem Definition [Diagram: the seed A plus a set of labels.]

  10. Semi-Formal Problem Definition [Diagram: seed A and selected labels.] Goal: Find the nodes most "similar" to A.

  11. How to Define Similarity? ● We address the computation of several node similarity measures: ● Neighborhood-based: Common Neighbors, Jaccard Coefficient, Adamic-Adar. ● Path-based: Katz. ● Random-walk based: Personalized PageRank. ● Experimental question: which measure is useful? ● Algorithmic questions: ● Can it scale to huge graphs? ● Can we compute it in real-time?
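To make the neighborhood-based measures concrete, here is a minimal toy sketch (my own illustration, not the paper's code; the graph, `adj`, and the node names are invented):

```python
import math
from collections import defaultdict

def common_neighbors(adj, u, v):
    # Number of queries that advertisers u and v both touch.
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    # Shared queries as a fraction of all queries touched by u or v.
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v, degree):
    # Shared queries weighted by 1/log(degree): rare shared queries
    # count more than popular ones.
    return sum(1.0 / math.log(degree[q])
               for q in adj[u] & adj[v] if degree[q] > 1)

# Toy bipartite graph: advertiser -> set of neighboring queries.
adj = {"nike_store": {"shoes", "apparel"},
       "soccer_shop": {"shoes", "ball"}}
degree = defaultdict(int)  # degree of each query node
for queries in adj.values():
    for q in queries:
        degree[q] += 1

print(common_neighbors(adj, "nike_store", "soccer_shop"))              # 1
print(round(jaccard(adj, "nike_store", "soccer_shop"), 3))             # 0.333
print(round(adamic_adar(adj, "nike_store", "soccer_shop", degree), 3)) # 1.443
```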

  12. Our Contribution ● Reduce and Aggregate: a general approach to induce real-time similarity rankings in multi-categorical bipartite graphs, which we apply to several similarity measures. ● Theoretical guarantees for the precision of the algorithms. ● Experimental evaluation with real-world data.

  13. Personalized PageRank For a node v (the seed) and a teleport probability alpha: the walk follows a random out-edge with probability 1 - alpha and jumps back to the seed v with probability alpha. The stationary distribution assigns a similarity score to each node in the graph w.r.t. node v.

  14. Challenges ● Our graphs are too big (billions of nodes) even for very large-scale MapReduce systems. ● MapReduce is not real-time. ● We cannot pre-compute the rankings for each subset of labels.

  15. Reduce and Aggregate [Diagram: three reduced graphs, one per category a, b, c.] Reduce: Given the bipartite graph and a category, construct a graph with only A nodes that preserves the ranking on the entire graph. Aggregate: Given a node v in A and the reduced graphs of the categories of interest, determine the ranking for v.
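A minimal sketch of the two-phase control flow just described, under the assumption that each similarity measure supplies its own Reduce and Aggregate operators (`reduce_one` and `aggregate` are hypothetical placeholders, not names from the paper):

```python
# Offline phase: build one reduced advertiser-side graph per category.
def reduce_phase(bipartite_graph, categories, reduce_one):
    return {c: reduce_one(bipartite_graph, c) for c in categories}

# Online phase: combine the reduced graphs of the categories the user
# selected into a single ranking for the seed node.
def aggregate_phase(seed, selected_categories, reduced, aggregate):
    return aggregate(seed, [reduced[c] for c in selected_categories])
```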

  16. Reduce (Precomputation) [Diagram: advertisers on one side, queries on the other.]

  17. Reduce (Precomputation) [Diagram: precomputed rankings for the first category.]

  18. Reduce (Precomputation) [Diagram: precomputed rankings for two categories.]

  19. Reduce (Precomputation) [Diagram: precomputed rankings for three categories.]

  20. Aggregate (Run Time) [Diagram: at run time, the precomputed rankings of the selected categories are combined into the ranking for Red + Yellow, for seed node A.]

  21. Reduce for Personalized PageRank [Diagram: the bipartite graph on sides A and B is reduced to a graph on side A alone.] ● Based on Markov chain state aggregation theory (Simon and Ando, '61; Meyer, '89; etc.). ● 750x reduction in the number of nodes while exactly preserving the PPR distribution on the entire graph.
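One standard construction from this theory is Meyer's stochastic complement, which is plausibly the kind of reduction behind this slide (the slide only cites the theory, so take this as an illustrative instance rather than the paper's exact operator):

```latex
% Partition the transition matrix by sides A and B and form the
% stochastic complement of block A (Meyer, 1989):
\[
  P = \begin{pmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{pmatrix},
  \qquad
  S_{AA} = P_{AA} + P_{AB}\,(I - P_{BB})^{-1} P_{BA}.
\]
% S_{AA} is a stochastic matrix on side A alone, and its stationary
% distribution equals the restriction of \pi to A, renormalized:
% \sigma_A = \pi_A / (\pi_A \mathbf{1}).
```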

  22. Run-time Aggregation

  23. Koury et al. Aggregation-Disaggregation Algorithm [Diagram: the chain's states split into subsets A and B.] Step 1: Partition the Markov chain into DISJOINT subsets.

  24. Koury et al. Aggregation-Disaggregation Algorithm [Diagram: distributions π_A and π_B on subsets A and B.] Step 2: Approximate the stationary distribution on each subset independently.

  25. Koury et al. Aggregation-Disaggregation Algorithm [Diagram: block transition matrices P_AA, P_AB, P_BA, P_BB between the subsets.] Step 3: Consider the transitions between subsets.

  26. Koury et al. Aggregation-Disaggregation Algorithm [Diagram: updated distributions on A and B.] Step 4: Aggregate the distributions. Repeat until convergence.
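A hedged sketch of one iterative aggregation/disaggregation scheme in this spirit (variants differ in the smoothing step; this is a textbook-style instance, not necessarily the exact Koury et al. formulation):

```python
import numpy as np

def iad(P, blocks, iters=100, tol=1e-12):
    """P: row-stochastic transition matrix (n x n), irreducible.
    blocks: list of index arrays forming a disjoint partition of the states."""
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)  # initial guess for the stationary distribution
    for _ in range(iters):
        # Steps 1-2: normalize the current guess within each block.
        phi = pi.copy()
        for J in blocks:
            phi[J] /= phi[J].sum()
        # Step 3: coupling matrix of transition mass between blocks.
        k = len(blocks)
        C = np.zeros((k, k))
        for i, I in enumerate(blocks):
            for j, J in enumerate(blocks):
                C[i, j] = phi[I] @ P[np.ix_(I, J)].sum(axis=1)
        # Stationary distribution of the small aggregated chain
        # (left Perron eigenvector of C).
        w, v = np.linalg.eig(C.T)
        xi = np.real(v[:, np.argmax(np.real(w))])
        xi = np.abs(xi) / np.abs(xi).sum()
        # Step 4: disaggregate, then take one power step to smooth.
        new = phi.copy()
        for j, J in enumerate(blocks):
            new[J] *= xi[j]
        new = new @ P
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new
    return pi

# Toy usage: a 4-state chain split into two blocks.
P = np.array([[0.5, 0.4, 0.1, 0.0],
              [0.3, 0.5, 0.1, 0.1],
              [0.1, 0.0, 0.6, 0.3],
              [0.0, 0.1, 0.4, 0.5]])
pi = iad(P, [np.array([0, 1]), np.array([2, 3])])
print(pi, np.allclose(pi, pi @ P))  # pi should satisfy pi = pi P
```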

  27. Aggregation in PPR [Diagram: subset A with its stationary distribution π_A.] Precompute the stationary distributions individually.

  28. Aggregation in PPR [Diagram: subset B with its stationary distribution π_B.] Precompute the stationary distributions individually.

  29. Aggregation in PPR [Diagram: subsets A and B overlap.] The two subsets are not disjoint!

  30. Our Approach [Diagram: the reduced graphs with distributions π_A and π_B.] ● The algorithm is based only on the reduced graphs over advertiser-side nodes. ● The aggregation algorithm is scalable and converges to the correct distribution.

  31. Experimental Evaluation ● We experimented with publicly available and proprietary datasets: ● Query-Ads graph from Google AdWords: > 1.5 billion nodes, > 5 billion edges. ● DBLP Author-Papers and Patent Inventor-Inventions graphs. ● Ground-truth clusters of competitors in Google AdWords.

  32. Patent Graph [Plot: precision vs. recall on the Patent graph for Inter, Jaccard, Adamic-Adar, Katz, and PPR.]

  33. Google AdWords [Plot: precision vs. recall on the Google AdWords graph.]

  34. Conclusions and Future Work ● It is possible to compute several similarity scores on very large bipartite graphs in real-time with good accuracy. ● Future work could focus on the relevant case where categories are not disjoint.

  35. Thank you for your attention

  36. Reduction to the Query Side [Diagram: reduction to the query side, with distributions π_A and π_B.]

  37. Reduction to the Query Side [Diagram as before.] This is the larger side of the graph.

  38. Convergence after One Iteration [Plot: Kendall-Tau correlation vs. ranking position (k) for DBLP, Patent, and Query-Ads (cost).]

  39. Convergence [Plot: approximation error (1 - cosine similarity, log scale) vs. number of iterations for DBLP and Patent.]
