A Distributed Algorithm for Large-Scale Generalized Matching


  1. A Distributed Algorithm for Large-Scale Generalized Matching Faraz Makari, Baruch Awerbuch, Rainer Gemulla, Rohit Khandekar, Julián Mestre, Mauro Sozio

  2. Recommender systems. Given: a user-item feedback matrix. Goal: recommend additional items users may like. [Figure: user-item rating matrix, most entries unknown ("?")]

  3. Recommender systems. Given: a user-item feedback matrix. Goal: recommend additional items users may like. Approach: I. Predict the missing ratings. [Figure: rating matrix with the missing entries filled in by predicted ratings]

  4. Recommender systems. Given: a user-item feedback matrix. Goal: recommend additional items users may like. Approach: I. Predict the missing ratings. II. Recommend the items with the highest predicted ratings. [Figure: rating matrix with predicted ratings]

  6. Recommender systems. Given: a user-item feedback matrix. Goal: recommend additional items users may like. Approach: I. Predict the missing ratings. II. Recommend the items with the highest predicted ratings. How to recommend items under constraints? [Figure: rating matrix with predicted ratings]

  7. Generalized Bipartite Matching (GBM). Constraints: neither too few nor too many recommendations; the number of DVDs is limited; … [Figure: bipartite graph of users and DVDs with predicted ratings as edge weights; # recommendations per user = 1, available DVDs per movie = 2]

  9. Generalized Bipartite Matching (GBM). Goal: recommend items to users such that I. they are interesting for the users, II. each user gets neither too few nor too many recommendations, and III. the availability of items is respected; formally, a maximum-weight matching subject to lower- and upper-bound (LB/UB) degree constraints. [Figure: bipartite graph of users and DVDs with predicted ratings as edge weights; 1 ≤ # recommendations per user ≤ 2, available DVDs per movie = 2]

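The slides describe the constraints only informally. A generalized bipartite matching problem of this kind is commonly written as the following LP relaxation (the notation below is mine, not taken from the paper); the fractional values x_ui of this relaxation play the role of the "edge probabilities" computed in phase 1:

```latex
\[
\begin{aligned}
\max_{x}\quad & \sum_{(u,i) \in E} w_{ui}\, x_{ui}
  && \text{(total predicted rating of the recommendations)} \\
\text{s.t.}\quad
  & \ell_u \le \sum_{i:\,(u,i)\in E} x_{ui} \le b_u
  && \text{for every user } u \quad (\text{e.g. } 1 \le \cdot \le 2) \\
  & \sum_{u:\,(u,i)\in E} x_{ui} \le c_i
  && \text{for every item } i \quad (\text{e.g. } c_i = 2 \text{ available DVDs}) \\
  & 0 \le x_{ui} \le 1
  && \text{for every edge } (u,i) \in E .
\end{aligned}
\]
```

The user-side lower bounds are covering constraints, while the user-side upper bounds and the item availabilities are packing constraints, which is why the LP fits the mixed packing-covering form used in phase 1.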

  11. Challenge. GBM is optimally solvable in polynomial time, e.g., via linear programming; available solvers handle small instances very well.

  12. Challenge. GBM is optimally solvable in polynomial time, e.g., via linear programming; available solvers handle small instances very well. But real applications can be large: Netflix, for example, has >20M users and >20k movies, and available solvers do not scale to such large problems.

  13. Challenge. GBM is optimally solvable in polynomial time, e.g., via linear programming; available solvers handle small instances very well. But real applications can be large: Netflix, for example, has >20M users and >20k movies, and available solvers do not scale to such large problems. Goal: an efficient and scalable algorithm for large-scale GBM instances.

  14. Framework. Phase 1: Approximate LP. Compute "edge probabilities" using linear programming. [Figure: bipartite graph with a fractional probability attached to every edge]

  15. Framework. Phase 1: Approximate LP. Compute "edge probabilities" using linear programming. Phase 2: Round. Select edges based on the probabilities from phase 1. [Figure: phase 1 yields fractional edge probabilities; phase 2 rounds them to a 0/1 selection of edges]

  16. Phase 1: Computing edge probabilities. Contribution 1: an algorithm for Mixed Packing-Covering (MPC) LPs (such as the GBM LP). It is a gradient-based multiplicative-weights update algorithm that approximately solves MPC LPs (ε: approximation parameter): the LB and UB constraints are satisfied up to a factor of (1 ± ε) (almost feasible), the objective value is at least (1 - ε) times the optimum (near-optimal), it finishes in a poly-logarithmic number of rounds, and it is easy to implement, since each round involves only matrix-vector multiplications.
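The slides only state the properties of the MPC algorithm; the precise updates are in the paper. Purely as a sketch of the general idea (exponential "weights" on the constraints and an update direction obtained from matrix-vector products), a multiplicative-weights-style loop for an MPC LP could look like the following; the function name, step-size heuristic, and stopping rule are my own assumptions, and no (1 - ε) guarantee is claimed for this sketch:

```python
import numpy as np

def mpc_multiplicative_weights_sketch(P, p, C, c, w, eps=0.1, rounds=2000, step=0.01):
    """Heuristic multiplicative-weights-style sketch for a mixed
    packing-covering LP:  maximize w.x  subject to  P x <= p (packing),
    C x >= c (covering), 0 <= x <= 1.
    It illustrates that each round needs only matrix-vector products;
    it is NOT the paper's algorithm and carries no approximation guarantee.
    """
    x = np.full(P.shape[1], 0.5)                 # start from a fractional point
    for _ in range(rounds):
        # Exponential constraint weights: nearly violated packing constraints
        # and badly uncovered covering constraints get the largest weight.
        y = np.exp(np.clip((P @ x - p) / eps, -50, 50))   # packing duals
        z = np.exp(np.clip((c - C @ x) / eps, -50, 50))   # covering duals
        # Gradient-like direction: push the objective and the covering
        # constraints up, push the heavily weighted packing constraints down.
        g = w + C.T @ z - P.T @ y
        x = np.clip(x + step * g / (np.abs(g).max() + 1e-12), 0.0, 1.0)
    return x
```

For GBM, the rows of P would correspond to the user upper bounds and the item availabilities, and the rows of C to the user lower bounds.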

  17. Phase 1: Computing edge probabilities. Contribution 1: an algorithm for Mixed Packing-Covering (MPC) LPs (such as the GBM LP). [Figure, slides 17-20: example bipartite graph with degree bounds 1 ≤ deg ≤ 2 on one side and 0 ≤ deg ≤ 2 on the other; the edge values and the resulting per-vertex degree sums are shown over successive rounds of the algorithm, gradually moving toward values that respect the bounds]

  21. Phase 1: Computing edge probabilities. Contribution 2: distributed processing for GBM (details in the paper). Communication depends on the number of nodes, not on the number of edges, and all computation is done in parallel. [Figure: the example graph with the edge values of the final round]
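The distributed scheme itself is described only in the paper. As a rough, purely illustrative sketch of why per-round communication can depend on the number of nodes rather than the number of edges (my own toy example, not the paper's protocol): if the edges are partitioned across workers, each worker can compute partial per-vertex degree sums over its local edges, and only these vertex-indexed partial sums need to be exchanged.

```python
from collections import defaultdict

def local_degree_sums(local_edges, local_values):
    """Per-vertex partial sums of edge values over one worker's edge partition."""
    sums = defaultdict(float)
    for (u, i), value in zip(local_edges, local_values):
        sums[u] += value
        sums[i] += value
    return sums

def combine(partials):
    """Aggregate the per-vertex partial sums from all workers (an all-reduce);
    the message size is proportional to the number of vertices, not edges."""
    total = defaultdict(float)
    for part in partials:
        for v, s in part.items():
            total[v] += s
    return total

# Toy example with two workers holding disjoint edge partitions.
worker1 = local_degree_sums([("u1", "i1"), ("u1", "i2")], [0.4, 0.9])
worker2 = local_degree_sums([("u2", "i1")], [0.7])
print(dict(combine([worker1, worker2])))   # e.g. the degree sum of i1 is 1.1
```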

  22. Phase 2: Selecting edges. Given: the edge probabilities from phase 1. Goal: select the edges of the final solution such that I. the LB and UB constraints are satisfied (up to rounding) and II. the approximation guarantee is preserved (in expectation). [Figure: bipartite graph with the fractional edge probabilities from phase 1]

  23. Phase 2: Selecting edges (Given and Goal as on slide 22). Naïve approach: round each edge independently using its probability from phase 1. [Figure: same graph and probabilities]

  24. Phase 2: Selecting edges (Given and Goal as on slide 22). Naïve approach: round each edge independently using its probability from phase 1. This preserves the approximation guarantee in expectation (II ✔), but it can violate the LB and UB constraints (I ✖). [Figure: same graph and probabilities]
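A minimal illustration (my own example, not from the slides) of why independent rounding breaks the degree bounds: a user with three candidate edges of probability 0.5 each has expected degree 1.5, well inside a bound of 1 ≤ deg ≤ 2, yet independent coin flips land outside that range about a quarter of the time.

```python
import random

random.seed(0)
probs = [0.5, 0.5, 0.5]        # edge probabilities at one user from phase 1
trials, violations = 100_000, 0
for _ in range(trials):
    degree = sum(random.random() < p for p in probs)   # independent rounding
    if not 1 <= degree <= 2:                           # LB/UB degree bounds
        violations += 1
print(f"degree bounds violated in {violations / trials:.1%} of trials")  # ~25%
```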

  25. Phase 2: Selecting edges (Given and Goal as on slide 22). Sequential algorithm [Gandhi et al. 06]: [Figure: same graph and probabilities]

  26. Phase 2: Selecting edges (Given and Goal as on slide 22). Sequential algorithm [Gandhi et al. 06]: 1. Find a cycle or a maximal path. [Figure: same graph and probabilities]

  27. Phase 2: Selecting edges (Given and Goal as on slide 22). Sequential algorithm [Gandhi et al. 06]: 1. Find a cycle or a maximal path. 2. Modify the edge probabilities along the cycle/maximal path (details omitted); if an edge probability becomes 1, include the edge in the solution; if it becomes 0, remove the edge from the graph. [Figure: a cycle highlighted in the graph]

  28. Phase 2: Selecting edges (Given and Goal as on slide 22). Sequential algorithm [Gandhi et al. 06]: 1. Find a cycle or a maximal path. 2. Modify the edge probabilities along the cycle/maximal path (details omitted); if an edge probability becomes 1, include the edge in the solution; if it becomes 0, remove the edge from the graph. [Figure: after one modification step along the highlighted cycle, one edge probability has reached 1 and that edge is added to the solution]
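The slide omits how the probabilities on the cycle/maximal path are modified. Sketched below is the kind of adjustment step used in the dependent-rounding scheme of Gandhi et al. (as I understand it; the paper's exact procedure may differ): split the edges of the cycle/path into two alternating classes, shift probability mass from one class to the other by the largest amount that keeps all values in [0, 1], and choose the direction of the shift with probabilities that leave every edge's expected value unchanged. At least one edge then reaches 0 or 1.

```python
import random

def rounding_step(path_probs):
    """One dependent-rounding adjustment along a cycle or maximal path
    (a sketch in the spirit of Gandhi et al. 2006, not the paper's code).
    path_probs: strictly fractional edge values listed in order along the
    cycle/path (at least two edges; cycles in a bipartite graph have even
    length, so the alternating split is consistent around the cycle).
    Each value is unchanged in expectation, and at least one becomes 0 or 1.
    """
    A = path_probs[0::2]                      # alternating edge classes
    B = path_probs[1::2]
    alpha = min(min(1 - x for x in A), min(x for x in B))   # max shift: A up, B down
    beta = min(min(x for x in A), min(1 - x for x in B))    # max shift: A down, B up
    if random.random() < beta / (alpha + beta):
        delta_a, delta_b = alpha, -alpha
    else:
        delta_a, delta_b = -beta, beta
    return [x + (delta_a if k % 2 == 0 else delta_b)
            for k, x in enumerate(path_probs)]

# Example on a 4-edge cycle: one step rounds some edge to 0 or 1.
print(rounding_step([0.4, 0.9, 0.6, 0.3]))
```

Because consecutive edges of the cycle/path share a vertex and fall into opposite classes, the two shifts cancel at every internal vertex, which is how the degree sums are preserved during rounding.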
