A Distributed Algorithm for Large-Scale Generalized Matching
Faraz Makari, Baruch Awerbuch, Rainer Gemulla, Rohit Khandekar, Julián Mestre, Mauro Sozio
Recommender systems
Given: a user-item feedback matrix
Goal: recommend additional items users may like
Approach:
I. Predict the missing ratings
II. Recommend the items with the highest predicted ratings
How to recommend items under constraints?
[Figure: user-item rating matrix; observed ratings shown, missing entries filled in with predicted ratings]
Generalized Bipartite Matching (GBM)
Users and DVDs form a bipartite graph; the edge weights are the predicted ratings.
Constraints:
- Neither too few nor too many recommendations per user (e.g., # recommendations = 1)
- Number of DVDs limited (e.g., available DVDs per movie = 2)
[Figure: bipartite graph of users and DVDs with predicted ratings on the edges]
Generalized Bipartite Matching (GBM)
Example constraints: 1 ≤ # recommendations per user ≤ 2; available DVDs per movie = 2
Goal: recommend items to users such that
I. the items are interesting for the users (maximum weight matching),
II. each user gets neither too few nor too many recommendations (lower and upper bound constraints), and
III. the availability of items is respected.
[Figure: bipartite graph of users and DVDs with predicted ratings on the edges]
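For concreteness, the slide's three requirements correspond to the standard LP relaxation of GBM sketched below. The symbols ℓ_u, b_u (per-user lower/upper bounds) and c_i (item availability) are notation assumed here for illustration; w_{ui} is the predicted rating of item i for user u.

```latex
\begin{align*}
\max_{x}\quad & \sum_{(u,i)\in E} w_{ui}\, x_{ui}
  && \text{(I. total predicted rating)}\\
\text{s.t.}\quad
  & \ell_u \le \sum_{i:(u,i)\in E} x_{ui} \le b_u
  && \text{for every user } u \quad \text{(II. LB/UB constraints)}\\
  & \sum_{u:(u,i)\in E} x_{ui} \le c_i
  && \text{for every item } i \quad \text{(III. availability)}\\
  & 0 \le x_{ui} \le 1
  && \text{for every edge } (u,i)\in E.
\end{align*}
```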
Challenge
GBM is optimally solvable in polynomial time, e.g., via linear programming, and available solvers handle small instances very well.
Real applications can be large: Netflix, for example, has >20M users and >20k movies, and available solvers do not scale to such problems.
Goal: an efficient and scalable algorithm for large-scale GBM instances
Framework
Phase 1 (Approximate LP): compute "edge probabilities" using linear programming.
Phase 2 (Round): select edges based on the probabilities from phase 1.
[Figure: fractional edge probabilities from phase 1 are rounded to 0/1 values in phase 2]
Phase 1: Computing edge probabilities
Contribution 1: an algorithm for Mixed Packing-Covering (MPC) LPs, a class that includes the GBM LP.
- Gradient-based multiplicative-weights update algorithm
- Approximately solves MPC LPs (ε: approximation parameter)
- Almost feasible: LB and UB constraints satisfied up to (1 ± ε)
- Near-optimal: objective value at least (1 − ε) of the optimum
- Converges in a poly-logarithmic number of rounds
- Easy to implement: each round involves only matrix-vector multiplications
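The slides only name the algorithm family, so the fragment below is a minimal, illustrative sketch of a generic multiplicative-weights step for an MPC feasibility system Px ≤ 1, Cx ≥ 1 (the GBM objective can be folded into an extra covering constraint by guessing the optimum value), not the paper's exact update rule; the function name, the choice of eta, and the stopping test are assumptions. It only shows the per-round pattern highlighted on the slide: a few matrix-vector products followed by a multiplicative step on the edge variables.

```python
import numpy as np

def mwu_mpc(P, C, eps=0.1, max_rounds=1000):
    """Illustrative sketch (not the paper's exact rule) of a multiplicative-weights
    step for a mixed packing-covering feasibility system:
        find x >= 0 with  P @ x <= 1  (packing)  and  C @ x >= 1  (covering)."""
    n = P.shape[1]
    x = np.full(n, 1.0 / n)                       # small uniform starting point
    eta = np.log(P.shape[0] + C.shape[0]) / eps   # smoothing parameter (assumed)
    for _ in range(max_rounds):
        pack, cover = P @ x, C @ x                # the matrix-vector products
        if pack.max() <= 1 + eps and cover.min() >= 1 - eps:
            break                                 # approximately feasible
        y = np.exp(np.clip(eta * (pack - 1.0), -50, 50))   # packing-row weights
        z = np.exp(np.clip(eta * (1.0 - cover), -50, 50))  # covering-row weights
        grad = P.T @ y - C.T @ z                  # gradient of the smoothed potential
        step = np.clip(grad / (np.abs(grad).max() + 1e-12), -1.0, 1.0)
        x *= np.exp(-eps * step)                  # multiplicative update on the edges
        x = np.minimum(x, 1.0)                    # edge probabilities stay in [0, 1]
    return x
```

For GBM, P would encode the per-user upper bounds and the item capacities (scaled so the right-hand side is 1) and C the per-user lower bounds.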
Phase 1: Computing edge probabilities (continued)
[Figure: the algorithm run on an example bipartite graph with degree constraints 1 ≤ deg ≤ 2 on the users and 0 ≤ deg ≤ 2 on the items; over successive rounds the edge probabilities are adjusted until every fractional degree approximately satisfies its bounds]
Contribution 2: distributed processing for GBM (details in paper)
- Communication depends on the number of nodes, not the number of edges
- All computation is carried out in parallel
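The slides give no protocol details, so the fragment below is only a toy illustration (an assumption, not the paper's protocol) of why per-round communication can scale with the number of vertices: each worker holds a disjoint set of edges, computes partial per-vertex degree sums locally, and only those vertex-length vectors need to be aggregated before the local multiplicative update. The names `local_degree_sums` and `allreduce` are hypothetical.

```python
import numpy as np

def local_degree_sums(local_edges, x_local, n_vertices):
    """Per-vertex sums of the current edge values, computed from one worker's
    local edges only; the result has length #vertices, independent of #edges."""
    deg = np.zeros(n_vertices)
    for k, (u, v) in enumerate(local_edges):
        deg[u] += x_local[k]   # user side
        deg[v] += x_local[k]   # item side (vertex ids assumed disjoint from users)
    return deg

def allreduce(partial_sums):
    """Stand-in for the cross-worker aggregation: only vertex-length vectors
    are exchanged, never the per-edge variables themselves."""
    return np.sum(np.stack(partial_sums), axis=0)
```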
Phase 2: Selecting edges
Given: edge probabilities from phase 1
Goal: select the edges of the final solution such that
I. the LB and UB constraints are satisfied (up to rounding), and
II. the approximation guarantee is preserved (in expectation).
Naïve approach: round each edge independently using its probability from phase 1.
This preserves the approximation guarantee in expectation (II ✔), but the degree constraints can be violated (I ✖).
[Figure: example graph with the fractional degrees and edge probabilities from phase 1]
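A two-edge toy example (numbers chosen for illustration, not taken from the slides) shows the failure mode: independent rounding keeps the expected number of recommendations at the fractional value, yet a lower bound of 1 recommendation is violated in a constant fraction of the outcomes.

```python
import numpy as np

rng = np.random.default_rng(0)

# One user, two candidate items, each with fractional value 0.5, so the
# fractional degree is exactly 1 (the lower bound). Independent rounding keeps
# the expected degree at 1, but leaves the user with no recommendation at all
# in about a quarter of the trials.
probs = np.array([0.5, 0.5])
picks = rng.random((100_000, 2)) < probs
print("expected #recommendations:", picks.sum(axis=1).mean())         # ~1.0
print("P[no recommendation]     :", (picks.sum(axis=1) == 0).mean())  # ~0.25
```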
Phase 2: Selecting edges (continued)
Sequential algorithm [Gandhi et al. 06] (dependent rounding):
1. Find a cycle or a maximal path among the fractional edges.
2. Modify the edge probabilities along the cycle/path (details omitted):
   - if an edge probability reaches 1, include the edge in the solution;
   - if an edge probability reaches 0, remove the edge from the graph.
Repeat until no fractional edges remain.
[Figure: a cycle of fractional edges is rounded; one edge probability becomes 1 and enters the solution]
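The "details omitted" step works roughly as follows in the dependent-rounding scheme of Gandhi et al.: the edges of the cycle/path are adjusted in alternation so that every inner vertex keeps its fractional degree, the expectation of every edge value is preserved, and at least one edge hits 0 or 1. The sketch below is a minimal illustration (function name and data layout assumed; it expects a cycle or path with at least two edges), not the authors' implementation.

```python
import random

def round_cycle_step(x, cycle_edges):
    """One dependent-rounding step on a cycle or maximal path of fractional edges.
    `x` maps edge -> fractional value in (0, 1); `cycle_edges` lists the edges
    in traversal order, so consecutive edges share a vertex."""
    A = cycle_edges[0::2]          # alternating 2-coloring of the edge sequence
    B = cycle_edges[1::2]
    alpha = min(min(1 - x[e] for e in A), min(x[e] for e in B))
    beta  = min(min(x[e] for e in A), min(1 - x[e] for e in B))
    if random.random() < beta / (alpha + beta):
        for e in A: x[e] += alpha  # opposite adjustments on consecutive edges keep
        for e in B: x[e] -= alpha  # every inner vertex's fractional degree unchanged
    else:
        for e in A: x[e] -= beta
        for e in B: x[e] += beta
    # E[change] = (beta*alpha - alpha*beta) / (alpha + beta) = 0 for every edge,
    # and the edge attaining the minimum reaches 0 (drop) or 1 (add to solution).
    return x
```

Only path endpoints can see their degree change, which is why the slide states that the LB and UB constraints hold "up to rounding".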