Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier (PowerPoint PPT Presentation)



  1. Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier. W. Xie, D. Bindel, A. Demers, J. Gehrke. KDD 2015, 12 Aug 2015.

  2. PageRank Model: unweighted, node weighted, edge weighted
     - Random surfer model: x^(t+1) = α P x^(t) + (1 − α) v, where P = A D^(−1)
     - Stationary distribution: M x = b, where M = I − α P and b = (1 − α) v
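To make the random-surfer system concrete, here is a minimal sketch of solving the stationary equation directly with a sparse linear solve; the tiny 4-node graph, the column-stochastic convention P = A D^(−1), and α = 0.85 are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: personalized PageRank as the linear system
# (I - alpha*P) x = (1 - alpha) v, with P = A D^{-1}.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def personalized_pagerank(A, v, alpha=0.85):
    """Solve M x = b with M = I - alpha*P, b = (1 - alpha)*v, P = A D^{-1}."""
    n = A.shape[0]
    d = np.asarray(A.sum(axis=0)).ravel()   # out-degrees (column sums, column-stochastic convention)
    P = A @ sp.diags(1.0 / d)               # transition matrix
    M = sp.identity(n) - alpha * P
    b = (1.0 - alpha) * v
    return spla.spsolve(M.tocsc(), b)

A = sp.csr_matrix(np.array([[0, 1, 1, 0],
                            [1, 0, 0, 1],
                            [0, 1, 0, 1],
                            [1, 0, 1, 0]], dtype=float))
v = np.array([1.0, 0.0, 0.0, 0.0])          # personalization concentrated on node 0
x = personalized_pagerank(A, v)
print(x, x.sum())                           # entries sum to ~1
```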

  3. Edge Weight vs Node Weight Personalization
     - Introduce personalization parameters w ∈ R^d in two ways:
       - Node weights: v_i = v_i(w), giving M x(w) = b(w)
       - Edge weights: p_ij = p_ij(w), giving M(w) x(w) = b
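A minimal sketch of the distinction. The edge-weight case follows the linear parameterization used later in the talk; the per-topic teleportation matrix V in the node-weight case is a hypothetical construction for illustration only.

```python
# Illustrative sketch (not the paper's code) of how the parameters w enter.
import numpy as np

alpha = 0.85

def node_weighted_system(P, V, w):
    """Node weights: M is fixed; only the right-hand side b(w) changes.
    Here b(w) = (1 - alpha) * V @ w, where V's columns are candidate
    teleportation distributions (an assumed construction)."""
    n = P.shape[0]
    M = np.eye(n) - alpha * P
    b = (1.0 - alpha) * (V @ w)
    return M, b

def edge_weighted_system(P_list, v, w):
    """Edge weights: b is fixed; the matrix M(w) = I - alpha * sum_i w_i P^(i)
    changes (the linear edge-weight case from the talk)."""
    n = P_list[0].shape[0]
    P_w = sum(w_i * P_i for w_i, P_i in zip(w, P_list))
    M = np.eye(n) - alpha * P_w
    return M, (1.0 - alpha) * v
```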

  4. Edge Weight vs Node Weight Personalization
     - Node weight personalization is well studied:
       - Topic-sensitive PageRank: fast methods based on linearity
       - Localized PageRank: fast methods based on sparsity
     - Some work on edge weight personalization:
       - ObjectRank/ScaleRank: personalize weights for different edge types
     - But lots of work incorporates edge weights without personalization
     - Our goal: general, fast methods for edge weight personalization

  5. Model Reduction
     - Expensive full model (M x = b) ≈ U × reduced model (M̃ y = b̃): U is the reduced basis, x̂ = U y the approximation ansatz
     - Model reduction procedure from the physical simulation world:
       - Offline: construct a reduced basis U ∈ R^(n×k)
       - Offline: choose ≥ k equations that determine the approximation x̂ = U y
       - Online: solve for y(w) given w and reconstruct x̂

  6. Reduced Basis Construction: SVD (aka POD/PCA/KL)
     - Collect snapshots x_1, x_2, ..., x_r of the PageRank solution at sample points w_1, w_2, ..., w_r
     - Truncated SVD of the snapshot matrix, [x_1 x_2 ... x_r] ≈ U Σ V^T, gives the reduced basis U
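A short sketch of this offline step, assuming a caller-provided full solver (for example the personalized_pagerank sketch above) and a list of sampled parameter vectors; the function name and workflow are assumptions consistent with the slide, not the paper's code.

```python
# Offline: sample parameters, solve the full system at each sample, and keep the
# dominant left singular vectors of the snapshot matrix as the reduced basis U.
import numpy as np

def build_reduced_basis(solve_full, sample_params, k):
    """solve_full(w) -> full PageRank vector x(w); sample_params: r parameter
    vectors; k: reduced dimension."""
    X = np.column_stack([solve_full(w) for w in sample_params])  # n x r snapshot matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s        # basis plus singular values, to check the decay justifying small k
```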

  7. Approximation Ansatz
     - Want the residual r = M U y − b ≈ 0. Consider two approximation conditions:
       - Bubnov-Galerkin: ansatz U^T r = 0; good accuracy empirically; fast when P(w) is linear
       - DEIM: ansatz min ‖r_I‖; fast even for nonlinear P(w); more complex cost/accuracy tradeoff
     - Similar error analysis framework for both (see paper): Consistency + Stability = Accuracy
       - Consistency: does the subspace contain good approximants?
       - Stability: is the approximation subproblem far from singular?

  8. Bubnov-Galerkin Method
     - Impose U^T (M U y − b) = 0
     - Linear case, with w_i = probability of transition along an edge of type i:
       M(w) = I − α Σ_i w_i P^(i)  and  M̃(w) = I − α Σ_i w_i P̃^(i),
       where P̃^(i) = U^T P^(i) U can be precomputed
     - Nonlinear case: the cost to form M̃(w) is comparable to the cost of PageRank!
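A minimal sketch of the offline/online split for the linear edge-weight case, assuming U has orthonormal columns (as it does when taken from an SVD); function names are illustrative, not from the paper.

```python
# Bubnov-Galerkin, linear edge-weight case: precompute P_tilde^(i) = U^T P^(i) U
# offline; online, assemble and solve only a small k x k system.
import numpy as np

def galerkin_offline(U, P_list):
    """Precompute the reduced operators P_tilde^(i) = U^T P^(i) U (k x k each)."""
    return [U.T @ (Pi @ U) for Pi in P_list]

def galerkin_online(U, P_tilde_list, v, w, alpha=0.85):
    """Assemble M_tilde(w) = I - alpha * sum_i w_i P_tilde^(i), solve for y,
    and reconstruct the approximate PageRank vector x_hat = U y."""
    k = U.shape[1]
    M_tilde = np.eye(k) - alpha * sum(wi * Pt for wi, Pt in zip(w, P_tilde_list))
    b_tilde = (1.0 - alpha) * (U.T @ v)
    y = np.linalg.solve(M_tilde, b_tilde)
    return U @ y
```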

  9. Discrete Empirical Interpolation Method (DEIM)
     - Enforce only the equations indexed by a chosen set I: (M U y − b)_I = 0
     - Ansatz: minimize ‖r_I‖ over the chosen indices I
     - Only need a few rows of M (and the associated rows of U)
     - Difference from physics applications: high-degree nodes!
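A small sketch of the corresponding online step, assuming the selected rows of M(w) and the matching entries of b have already been formed; this is a least-squares reading of the "minimize ‖r_I‖" ansatz, not the paper's implementation.

```python
# DEIM-style online solve: min_y || M(w)[I, :] @ U @ y - b[I] ||_2, then x_hat = U @ y.
import numpy as np

def deim_online(U, M_rows_I, b_I):
    """M_rows_I: the |I| x n block of M(w) for the chosen indices I (only these
    rows need to be formed online); b_I: the corresponding entries of b."""
    A_small = M_rows_I @ U                         # |I| x k reduced system
    y, *_ = np.linalg.lstsq(A_small, b_I, rcond=None)
    return U @ y
```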

  10. Interpolation Costs
     - Figure: the subgraph relevant to one interpolation equation i ∈ I, i.e. node i and its incoming neighbors (example edge weights 1/3, 1/50)
     - Really care about the weights of edges incident on I
     - Need more edges to normalize (unless A(w) is linear)
     - High in/out-degree nodes are expensive but informative
     - Key question: how to choose I to balance cost vs accuracy?

  11. Interpolation Accuracy
     - Key: keep M_{I,:} far from singular
     - If |I| = k, this is subset selection over the rows of M U; standard techniques apply (e.g. pivoted QR)
     - Want to pick I once, so look at the rows of Z = [ M(w^(1)) U   M(w^(2)) U   ... ] for sample parameters w^(i)
     - Helps to explicitly enforce Σ_i x̂_i = 1
     - Several heuristics for the cost/accuracy tradeoff (see paper)
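A sketch of the pivoted-QR option named on the slide, assuming Z is built by stacking the blocks M(w^(j)) U for a handful of sampled parameters; selecting well-conditioned rows of Z corresponds to column-pivoted QR on Z^T.

```python
# Row subset selection for the interpolation indices I via pivoted QR.
import numpy as np
from scipy.linalg import qr

def select_interpolation_rows(M_of_w, U, sample_params, k):
    """Pick k row indices I by column-pivoted QR on Z^T, where
    Z = [M(w^(1)) U, M(w^(2)) U, ...] is stacked column-wise."""
    Z = np.hstack([M_of_w(w) @ U for w in sample_params])   # n x (k * num_samples)
    _, _, piv = qr(Z.T, pivoting=True, mode='economic')     # pivots = well-conditioned rows of Z
    return np.sort(piv[:k])
```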

  12. Online Costs
     - With ℓ = number of PageRank components needed, the online costs are:
       - Form M̃: O(d k^2) for Bubnov-Galerkin; more complex for DEIM
       - Factor M̃: O(k^3)
       - Solve for y: O(k^2)
       - Form U y: O(k ℓ)
     - Online costs do not depend on the graph size (unless you want the whole PageRank vector)

  13. Example Networks
     - DBLP (citation network): 3.5M nodes / 18.5M edges; seven edge types ⇒ seven parameters; P(w) linear; competition: ScaleRank
     - Weibo (micro-blogging): 1.9M nodes / 50.7M edges; edges weighted by topical similarity of posts; number of parameters = number of topics (5, 10, 20)
     - (Studied global and local PageRank; see paper for the latter.)

  14. Singular Value Decay
     - Plot: i-th largest singular value (log scale, roughly 10^6 down to 10^-1 over the first 200 indices) for DBLP-L, Weibo-S5, Weibo-S10, and Weibo-S20
     - r = 1000 samples, k = 100

  15. DBLP Accuracy
     - Bar chart (log scale, 10^0 down to 10^-5): Kendall@100 and normalized L1 error for Galerkin, the DEIM variants (including DEIM-100 and DEIM-200), and ScaleRank

  16. DBLP Running Times (All Nodes)
     - Bar chart: running time in seconds (0 to 0.7), split into coefficient solve and construction, for Galerkin, the DEIM variants (including DEIM-100 and DEIM-200), and ScaleRank

  17. Weibo Accuracy
     - Bar chart (log scale, 10^-1 down to 10^-4): Kendall@100 and normalized L1 error vs number of parameters (5, 10, 20)

  18. Weibo Running Times (All Nodes)
     - Bar chart: running time in seconds (0 to 0.5), split into coefficient solve and construction, vs number of parameters (5, 10, 20)

  19. Application: Learning to Rank
     - Goal: given training pairs T = {(i_q, j_q)}, q = 1, ..., |T|, find w that mostly ranks i_q above j_q (cf. Backstrom and Leskovec, WSDM 2011)
     - Standard approach: gradient descent on the full problem
       - One PageRank computation for the objective
       - One PageRank computation for each gradient component
       - Costs d + 1 PageRank computations per step
     - With model reduction (see the sketch below):
       - Rephrase the objective in the reduced coordinate space
       - Use a factorization to solve the reduced PageRank system for the objective
       - Re-use the same factorization for the gradient
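A small sketch of why one factorization suffices, assuming the reduced linear edge-weight case from the Bubnov-Galerkin slide: differentiating M̃(w) y = b̃ gives M̃(w) (∂y/∂w_i) = α P̃^(i) y, so the same LU factors serve the objective and every gradient component. Function and variable names are illustrative, not from the paper.

```python
# Reduced solve plus all d gradient directions from a single k x k LU factorization.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def reduced_solution_and_gradient(P_tilde_list, b_tilde, w, alpha=0.85):
    k = b_tilde.shape[0]
    M_tilde = np.eye(k) - alpha * sum(wi * Pt for wi, Pt in zip(w, P_tilde_list))
    lu = lu_factor(M_tilde)                                    # factor once ...
    y = lu_solve(lu, b_tilde)                                  # ... solve for y(w) (objective)
    dy = [lu_solve(lu, alpha * (Pt @ y)) for Pt in P_tilde_list]   # ... reuse for each dy/dw_i
    return y, dy
```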

  20. DBLP Learning Task
     - Plot: objective function value (about 100 to 400) vs iteration (0 to 20) for Standard, Galerkin, and DEIM-200
     - (8 papers for training + 7 params)

  21. The Punchline
     - Test case: DBLP, 3.5M nodes, 18.5M edges, 7 params
     - Cost per iteration:
       Method      Standard    Bubnov-Galerkin    DEIM-200
       Time (sec)  159.3       0.002              0.033

  22. Roads Not Taken
     - In the paper (but not the talk):
       - Selecting interpolation equations for DEIM
       - Localized PageRank experiments (Weibo and DBLP)
       - Comparison to BCA for localized PageRank
       - Quasi-optimality framework for error analysis
     - Room for future work: analysis, applications, systems, ...

  23. Questions?
     - Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier. Wenlei Xie, David Bindel, Johannes Gehrke, and Al Demers. KDD 2015, paper 117.
     - Sponsors: NSF (IIS-0911036 and IIS-1012593); iAd Project from the National Research Council of Norway
