

  1. Facebook Friends and Matrix Functions. Graduate Research Day. Joint work with Kyle Kloster and David F. Gleich (Purdue University). Supported by Purdue University and NSF CAREER 1149756-CCF.

  2. Network Analysis: use linear algebra to study graphs.

  3. Network Analysis: use linear algebra to study graphs. A graph G has V, vertices (nodes), and E, edges (links). The degree of a node is the number of edges incident to it; nodes sharing an edge are neighbors.

  4. Network Analysis: use linear algebra to study graphs. Applications: Erdős numbers, Kevin Bacon, Facebook friends, Twitter followers, search engines, Amazon/Netflix recommendations, protein interactions, power grids, Google Maps, air traffic control, sports rankings, cell tower placement, scheduling, parallel programming. Everything.

  5. Network Analysis: graph properties. Diameter: is everything just a few hops away from everything else?

  6. Network Analysis: graph properties. Diameter. Clustering: are there tightly-knit groups of nodes?

  7. Network Analysis: graph properties. Diameter. Clustering. Connectivity: how well can each node reach every other node?

  8. Network Analysis: graph properties meet linear algebra. Diameter, clustering, connectivity: eigenvalues and matrix functions shed light on all these questions. These tools require a matrix related to the graph…

  9. Graph Matrices. Adjacency matrix, A: $A_{ij} = 1$ if nodes $i, j$ share an edge (are adjacent), and $A_{ij} = 0$ otherwise. Random-walk transition matrix, P: $P_{ij} = A_{ij} / d_j$, where $d_j$ is the degree of node $j$. P is stochastic, i.e. its column sums equal 1.
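To make these definitions concrete, here is a minimal sketch in Python with NumPy; the 4-node edge list is a hypothetical example, not data from the talk.

```python
import numpy as np

# Hypothetical 4-node undirected graph given as an edge list.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency matrix: A[i, j] = 1 if nodes i and j share an edge.
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Degree of node j = number of incident edges = column sum j of A.
d = A.sum(axis=0)

# Random-walk transition matrix: P[i, j] = A[i, j] / d[j].
P = A / d

# P is column-stochastic: every column sums to 1.
assert np.allclose(P.sum(axis=0), 1.0)
```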

  10. Network analysis via the Heat Kernel. Uses include local clustering, link prediction, and node centrality. The heat kernel is a graph diffusion and a function of a matrix: $\exp(G) = \sum_{k=0}^{\infty} \frac{1}{k!} G^k$, for $G$ a network's matrix: the adjacency matrix A, the random-walk matrix P, or the Laplacian L.

  11. The Heat Kernel describes node connectivity. $(A^k)_{ij}$ = number of walks of length $k$ from node $i$ to node $j$. The heat kernel "sums up" the walks between $i$ and $j$: $\exp(A)_{ij} = \sum_{k=0}^{\infty} \frac{1}{k!} (A^k)_{ij}$. For a small set of seed nodes, $s$, the vector $\exp(A)\, s$ describes the nodes most relevant to $s$.
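A short sketch checking both claims on the toy graph above; the truncation degree N = 20 is an arbitrary illustrative choice.

```python
import numpy as np
from scipy.linalg import expm

# Same hypothetical 4-node graph as in the previous sketch.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# (A^k)[i, j] counts walks of length k from i to j; for example,
# (A^2)[0, 0] is the number of length-2 round trips from node 0,
# which equals its degree.
print((A @ A)[0, 0])

# exp(A) s via the truncated Taylor series sum_{k=0}^{N} (1/k!) A^k s.
s = np.array([1.0, 0.0, 0.0, 0.0])   # seed: node 0
x, term, N = np.zeros(4), s.copy(), 20
for k in range(N):
    x += term
    term = A @ term / (k + 1)        # next term: A^{k+1} s / (k+1)!
x += term

# The large entries of x mark the nodes most relevant to the seed.
print(np.allclose(x, expm(A) @ s))   # agrees with the exact exponential
```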

  12. Diffusion score. The "diffusion scores" of a graph are a weighted sum of probability vectors: diffusion score vector $f = c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + \cdots = \sum_{k=0}^{\infty} c_k P^k s$, where $P$ is the random-walk transition matrix, $s$ is the normalized seed vector, and $c_k$ is the weight on stage $k$.

  13. Heat Kernel vs. PageRank Diffusions. The heat kernel uses weight $t^k/k!$ at stage $k$: $f = \frac{t^0}{0!} p_0 + \frac{t^1}{1!} p_1 + \frac{t^2}{2!} p_2 + \frac{t^3}{3!} p_3 + \cdots$. Our work is new analysis and algorithms for this diffusion. PageRank uses $\beta^k$ at stage $k$: $f = \beta^0 p_0 + \beta^1 p_1 + \beta^2 p_2 + \beta^3 p_3 + \cdots$. It is the standard, widely-used diffusion we use for comparison. Linchpin of Google's original success!
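Both diffusions are instances of the template from the previous slide, $f = \sum_k c_k P^k s$; only the weights $c_k$ differ. A small sketch; the values t = 5 and β = 0.85 are illustrative choices, and the PageRank sum is left unnormalized.

```python
import numpy as np
from math import factorial

def diffusion(P, s, coeffs):
    """f = sum_k coeffs[k] * P^k s, truncated at len(coeffs) terms."""
    f, walk = np.zeros_like(s), s.copy()
    for c in coeffs:
        f += c * walk
        walk = P @ walk              # advance the random walk one step
    return f

A = np.array([[0, 1, 1, 0], [1, 0, 1, 0],
              [1, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0)                # column-stochastic transition matrix
s = np.array([1.0, 0.0, 0.0, 0.0])

t, beta, N = 5.0, 0.85, 30
hk = diffusion(P, s, [t**k / factorial(k) for k in range(N)])   # heat kernel
pr = diffusion(P, s, [beta**k for k in range(N)])               # PageRank
print(hk, pr)
```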

  14. Heat Kernel vs. PageRank: theory. We want two things from a diffusion: good clusters and a fast algorithm. PageRank has both: a local Cheeger inequality ("PR finds near-optimal clusters") [Andersen Chung Lang 06] and an existing constant-time algorithm.

  15. Heat Kernel vs. PageRank: theory (continued). The heat kernel also has a local Cheeger inequality [Chung 07], but no fast algorithm so far.

  16. Heat Kernel vs. PageRank: theory (continued). Filling in the missing entry, a fast algorithm for the heat kernel, is our work.

  17. Algorithm outline: compute $\hat{x} \approx \exp(P)\, s$. (1) Approximate the exponential with a polynomial. (2) Convert to a linear system (details in paper). (3) Solve with a sparse linear solver.

  18. Algorithm outline (continued): the sparse linear solver is Gauss-Southwell. To solve $A x \approx b$, maintain the residual $r^{(k)} := b - A x^{(k)}$ and repeatedly "relax" the largest entry in $r$, updating only that coordinate of $x^{(k)}$ to obtain $x^{(k+1)}$.
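A minimal sketch of the Gauss-Southwell idea; the update rule is my reconstruction of the slide, not the authors' exact code. Each step relaxes only the coordinate with the largest residual, so it touches one column of A instead of performing a full matrix-vector product.

```python
import numpy as np

def gauss_southwell(A, b, tol=1e-8, max_steps=100_000):
    """Solve A x = b by repeatedly relaxing the largest residual entry."""
    x = np.zeros(len(b))
    r = b.copy()                         # residual r = b - A x, with x = 0
    for _ in range(max_steps):
        j = int(np.argmax(np.abs(r)))    # coordinate with largest residual
        if abs(r[j]) < tol:
            break
        delta = r[j] / A[j, j]           # "relax" coordinate j
        x[j] += delta
        r -= delta * A[:, j]             # residual update: one column of A
    return x

# Usage on a small diagonally dominant system (illustrative only).
rng = np.random.default_rng(0)
A = rng.random((5, 5)) + 5 * np.eye(5)
b = rng.random(5)
x = gauss_southwell(A, b)
print(np.linalg.norm(A @ x - b))         # ~0
```

The locality is the point: on a sparse graph matrix, each relaxation step costs about one node's degree, which is what makes the overall method fast.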

  19. Algorithm outline (continued). Key: we avoid computing the full matrix-vector products in $\exp(P)\, s \approx \sum_{k=0}^{N} \frac{1}{k!} P^k s$.

  20. Algorithm outline (continued). (All my work was showing this actually can be done with bounded error.)
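The conversion to a linear system is deferred to the paper; the following is only a sketch of one standard construction, with the block layout and the name $v_k$ invented here for exposition. Chaining the Taylor terms as $v_{k+1} = P v_k / (k+1)$ gives a block bidiagonal system, and a coordinate solver like Gauss-Southwell can then compute $\hat{x} = \sum_k v_k$ while touching only the entries it needs.

```python
import numpy as np
from scipy.linalg import expm

# Toy column-stochastic P and seed s (same graph as earlier sketches).
A = np.array([[0, 1, 1, 0], [1, 0, 1, 0],
              [1, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0)
s = np.array([1.0, 0.0, 0.0, 0.0])
n, N = 4, 6                      # N = Taylor truncation degree (arbitrary)

# Unknown v = [v_0; v_1; ...; v_N] with v_0 = s and
# v_{k+1} - P v_k / (k+1) = 0, so that v_k = P^k s / k!.
M = np.eye(n * (N + 1))
for k in range(N):
    M[(k + 1) * n:(k + 2) * n, k * n:(k + 1) * n] = -P / (k + 1)
rhs = np.zeros(n * (N + 1))
rhs[:n] = s

v = np.linalg.solve(M, rhs)              # stand-in for the sparse solver
xhat = v.reshape(N + 1, n).sum(axis=0)   # xhat = v_0 + ... + v_N
print(np.allclose(xhat, expm(P) @ s, atol=1e-3))
```

The dense solve here is only a stand-in; the point is that a Gauss-Southwell-style solver applied to this system never needs a full product with P.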

  21. Algorithms & Theory for $\hat{x} \approx \exp(P)\, s$. Algorithm 1, Weak Convergence: runtime $\tilde{O}(e \cdot 1/\varepsilon)$, constant time on any graph; outperforms PageRank in clustering; accuracy: $\| D^{-1} x - D^{-1} \hat{x} \|_\infty < \varepsilon$.

  22. Algorithms & Theory for $\hat{x} \approx \exp(P)\, s$: what $\| D^{-1} x - D^{-1} \hat{x} \|_\infty < \varepsilon$ means conceptually. The diffusion vector quantifies a node's connection to each other node. Divide each node's score by its degree and delete the nodes with score < ε: only a constant number of nodes remain in G! Users spend "reciprocated time" with O(1) others.
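As a small illustration of this thresholding, with all numbers made up for the example: scale the diffusion scores by $D^{-1}$, then keep only the entries at least ε.

```python
import numpy as np

# Hypothetical diffusion scores x, node degrees d, and tolerance eps.
x = np.array([0.90, 0.30, 0.25, 0.01])
d = np.array([2.0, 2.0, 3.0, 1.0])
eps = 0.05

scaled = x / d                            # degree-normalized scores D^{-1} x
kept = np.flatnonzero(scaled >= eps)      # nodes that survive the threshold
print(kept, scaled[kept])
```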

  23. Algorithms & Theory for $\hat{x} \approx \exp(P)\, s$. Algorithm 2, Global Convergence (conditional).

  24. Power-law Degrees. Real-world graphs have power-law degree distributions; this causes diffusions to be localized. [Log-log plot: node in-degree vs. rank in Ljournal-2008.] [Boldi et al., Laboratory for Web Algorithmics 2008]

  25. Local solutions. [Left plot: magnitude of entries in the solution vector. Right plot: 1-norm error of the approximation using only the largest nonzeros retained.] The solution $\sum_{k=0}^{\infty} \frac{1}{k!} A^k s$ has ~5 million nonzeros (nnz = 4,815,948)!

  26. Local solutions (continued). [Same plots.] Only ~3,000 entries are needed for $10^{-4}$ accuracy!

  27. Algorithms & Theory for $\hat{x} \approx \exp(P)\, s$. Algorithm 2, Global Convergence (conditional): sublinear runtime $\tilde{O}(d \log d \, (1/\varepsilon)^C)$ on power-law graphs; accuracy: $\| x - \hat{x} \|_1 < \varepsilon$.

  28. Algorithms & Theory for $\hat{x} \approx \exp(P)\, s$: what $\| x - \hat{x} \|_1 < \varepsilon$ means conceptually. A node's diffusion vector can be approximated with total error < ε using only O(d log d) entries. In real-world networks (i.e. with degrees following a power law), no node has a nontrivial connection with more than O(d log d) other nodes.

  29. Experiments

  30. Runtime on the web graph. A particularly sparse graph benefits us best: |V| = O(10^8), |E| = O(10^9). [Plot: time (sec) vs. trial for EXPMV, GSQ, and GS.] GSQ, GS: our methods; EXPMV: MATLAB.

  31. Thank you. Local clustering via heat kernel: code available at http://www.cs.purdue.edu/homes/dgleich/codes/hkgrow . Global heat kernel: code available at http://www.cs.purdue.edu/homes/dgleich/codes/nexpokit/ . Questions or suggestions? Email Kyle Kloster at kkloste-at-purdue-dot-edu.
