Local clustering with graph diffusions and spectral solution paths
Kyle Kloster, joint with David F. Gleich (Purdue University)
Supported by NSF CAREER 1149756-CCF
Local Clustering
Given seed(s) S in G, find a good cluster near S.
"Near"? → local, a small set containing S
"Good"? → low conductance
Low-conductance sets are clusters
conductance(T) = (# edges leaving T) / (# edge endpoints in T), for small sets T, i.e. vol(T) < vol(G)/2
= "chance that a random edge touching T exits T"
For a global cluster, we could use Fiedler... but we want a local cluster.
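The definition above is easy to check directly. A minimal sketch in Python; the adjacency-dict representation and the two-triangle example graph are my illustrations, not from the talk:

```python
def conductance(adj, T):
    """phi(T) = (# edges leaving T) / (# edge endpoints in T),
    for small sets T with vol(T) <= vol(G)/2."""
    T = set(T)
    cut = sum(1 for u in T for v in adj[u] if v not in T)  # edges leaving T
    vol = sum(len(adj[u]) for u in T)                      # edge endpoints in T
    return cut / vol

# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(conductance(adj, {0, 1, 2}))  # one crossing edge / volume 7
```

Each triangle has conductance 1/7: one edge leaves it, and its three vertices have degrees 2, 2, 3.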
Fiedler
Compute the Fiedler vector v: L v = λ_2 D v
"Sweep" over v:
1. sort: v(1) ≥ v(2) ≥ ···
2. for each prefix set S_k = (1, …, k), compute conductance φ(S_k)
3. output the best S_k
Cheeger inequality: Fiedler finds a cluster "not too much worse" than the global optimum.
But we want local…
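The sweep step can be sketched in a few lines of Python; maintaining the cut and volume incrementally makes the whole sweep cost proportional to vol(G). The score vector and toy graph below are illustrative assumptions of mine:

```python
def sweep_cut(adj, scores):
    """Sort vertices by score (descending) and return the prefix set with
    the lowest conductance, updating cut and volume incrementally."""
    order = sorted(scores, key=scores.get, reverse=True)
    total_vol = sum(len(adj[u]) for u in adj)
    S, cut, vol = set(), 0, 0
    best_phi, best_k = float("inf"), 0
    for k, u in enumerate(order[:-1], start=1):  # skip the full vertex set
        vol += len(adj[u])
        for w in adj[u]:
            cut += -1 if w in S else 1  # a neighbor already in S closes a cut edge
        S.add(u)
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:
            best_phi, best_k = phi, k
    return set(order[:best_k]), best_phi

# Two triangles joined by edge (2, 3); higher scores on the left triangle.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
scores = {0: .9, 1: .8, 2: .7, 3: .3, 4: .2, 5: .1}
best_set, best_phi = sweep_cut(adj, scores)
print(best_set, best_phi)
```

With these scores the sweep recovers the left triangle {0, 1, 2} at conductance 1/7.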
Local Fiedler and diffusions
[Mahoney, Orecchia, Vishnoi 12] "A local spectral method…"
Fiedler: L v = D v [λ]
with local bias (MOV): L v = D v [λ] + "s" (normalized seed vector s)
THM: MOV is a scaling of personalized PageRank*!
Local Fiedler and diffusions
Intuition: why MOV ~ PageRank
Fiedler: L v = D v [λ]
with local bias: L v = D v [λ] + "s"
(I − D^{−1/2} A D^{−1/2}) v̂ = v̂ [λ] + "s"
A D^{−1} v̂ = v̂ [1 − λ] + "s"
(I − α P) v̂ = "s": a PageRank vector, a diffusion
PageRank and other diffusions
"Personalized" PageRank (PPR): standard setting (I − α P) x = ŝ; diffusion perspective x = Σ_{k=0}^∞ α^k P^k ŝ
[Andersen, Chung, Lang 06]: local Cheeger inequality and fast algorithm, the "push" procedure
Heat kernel diffusion (HK): f = Σ_{k=0}^∞ (t^k / k!) P^k ŝ (many more!)
Various diffusions explore different aspects of graphs.
[Plot: diffusion weight vs. walk length for PPR (α = 0.85, 0.99) and HK (t = 1, 5, 15)]
Diffusions, theory & practice

          | good conductance                                   | fast algorithm
PR        | Local Cheeger Inequality [Andersen Chung Lang 06]  | "PPR-push" is O(1/(ε(1−α))) [Andersen Chung Lang 06]
HK        | Local Cheeger Inequality [Chung 07]                | "HK-push" is O(e^t C/ε) [K., Gleich 2014]
TDPR      | Open question                                      | [Avron, Horesh 2015]
Gen Diff  | Open question                                      | This talk

David Gleich and I are working with Olivia Simpson (a student of Fan Chung's).
General diffusions: intuition
A diffusion propagates "rank" from a seed across a graph.
[Figure: seed node with high diffusion value; the surrounding region of high values forms a local cluster / low-conductance set]
General diffusions
A diffusion propagates "rank" from a seed across a graph.
General diffusion vector: f = Σ_{k=0}^∞ c_k P^k ŝ
f = c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + …
Sweep over f!
General algorithm
1. Approximate f by f̂ so that ‖D^{−1}(f − f̂)‖_∞ ≤ ε
2. Scale: D^{−1} f̂
3. Then sweep!
How to do this efficiently?
Algorithm intuition
From parameters c_k, ε, and seed s: starting from all mass at the seed, how do we end up at f = c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + …?
Begin with the mass at the seed(s) in a "residual" staging area, r_0. The residuals r_k hold mass that is unprocessed; it's like error.
Idea: "push" any entry with r_k(j)/d_j > (some threshold).
Push operation
push: (1) remove an entry from r_k, (2) add it to f (weighted by c_k), (3) then scale it and spread it to the neighbors in the next residual r_{k+1}. (Repeat.)
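The three-step push can be sketched end to end. This is my paraphrase in Python, using the stage threshold r_k(j) ≥ d(j)·ε/(2N) from the proof sketch later in the talk; it is not the authors' reference code:

```python
def general_push(adj, seed, coeffs, eps):
    """Approximate f = sum_k c_k P^k s. r[k] holds unprocessed walk mass;
    pushing node j (1) removes r[k][j], (2) adds it to f with weight c_k,
    (3) spreads it to j's neighbors in the next residual r[k+1]."""
    N = len(coeffs)
    r = [dict() for _ in range(N + 1)]
    r[0][seed] = 1.0
    f = {}
    for k in range(N):
        for j, m in list(r[k].items()):
            if m < len(adj[j]) * eps / (2 * N):
                continue                              # below threshold: leave as error
            del r[k][j]                               # (1) remove from r_k
            f[j] = f.get(j, 0.0) + coeffs[k] * m      # (2) add to f, weight c_k
            share = m / len(adj[j])                   # (3) spread to neighbors
            for w in adj[j]:
                r[k + 1][w] = r[k + 1].get(w, 0.0) + share
    return f

# Heat-kernel weights on the two-triangle toy graph, seeded at node 0.
import math
coeffs = [math.exp(-2.0) * 2.0 ** k / math.factorial(k) for k in range(15)]
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
approx = general_push(adj, 0, coeffs, 1e-9)
```

With a tiny ε every residual entry clears the threshold and the result matches the truncated series essentially exactly; the point of the thresholds is that for realistic ε the work stays local.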
Thresholds
The ERROR equals the weighted sum of the entries left in the residuals r_k (those below threshold).
→ Set the threshold so the "leftovers" sum to < ε.
The threshold for stage r_k is ε / (Σ_{j=k+1}^∞ c_j).
Then ‖D^{−1}(f − f̂)‖_∞ ≤ ε.
Another perspective
Fiedler: L v = D v [λ]
with local bias: L v = D v [λ] + "s"
(I − D^{−1/2} A D^{−1/2}) v̂ = v̂ [λ] + "s"
A D^{−1} v̂ = v̂ [1 − λ] + "s"
(I − α P) v̂ = "s": a PageRank vector, a diffusion
Another perspective
Fiedler: L V_k = D V_k Λ_k
with local bias: L V_k = D V_k Λ_k + S
(I − D^{−1/2} A D^{−1/2}) V̂_k = V̂_k Λ_k + Ŝ
A D^{−1} V̂_k = V̂_k (I − Λ_k) + Ŝ
Rearranging (absorbing signs and scalings into the seed term S̃): V̂_k − P V̂_k Γ = S̃
By the mixed-product property of the Kronecker product:
(I − Γ^T ⊗ P) vec(V̂_k) = vec(S̃)
Another perspective
(I − Γ^T ⊗ P) vec(V̂_k) = vec(S̃) generalizes PageRank, (I − α P) v̂ = s̃, to a "matrix teleportation parameter" Γ.
Standard spectral approach: Γ = (I − Λ_k)^{−1}
Our framework is equivalent to: Γ = diag(c̃_0, …, c̃_N)
(Details in [K., Gleich KDD 14])
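The "matrix teleportation parameter" claim can be sanity-checked numerically: when Γ is diagonal, the Kronecker system decouples into one ordinary PageRank solve per column. A small NumPy sketch; the path graph, γ values, and seed columns are arbitrary choices of mine:

```python
import numpy as np

# Random-walk matrix P = A D^{-1} for a 4-node path graph (toy example).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0)

gammas = np.array([0.85, 0.5])  # diagonal "matrix teleportation parameter"
Gamma = np.diag(gammas)
S = np.zeros((4, 2))
S[0, 0] = S[3, 1] = 1.0         # one seed per column

n, k = S.shape
# vec(V) - (Gamma^T kron P) vec(V) = vec(S); vec() stacks columns (order="F").
big = np.eye(n * k) - np.kron(Gamma.T, P)
V = np.linalg.solve(big, S.flatten(order="F")).reshape((n, k), order="F")

# Diagonal Gamma decouples: column i solves (I - gamma_i P) v_i = s_i.
v0 = np.linalg.solve(np.eye(n) - gammas[0] * P, S[:, 0])
print(np.allclose(V[:, 0], v0))  # True
```

The identity used is vec(P V Γ) = (Γ^T ⊗ P) vec(V), the mixed-product form of the Kronecker product with column-major vec.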
General diffusions: conclusion
THM: For diffusion coefficients c_k ≥ 0 satisfying
Σ_{k=0}^∞ c_k = 1 and Σ_{k=N+1}^∞ c_k ≤ ε/2 (the "rate of decay"),
"generalized push" approximates the diffusion f on a symmetric graph so that ‖D^{−1}(f − f̂)‖_∞ ≤ ε in work bounded by O(2N²/ε).
Constant for any inputs! (If the diffusion decays fast.)
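For a concrete diffusion the theorem's N is easy to compute; e.g. for heat-kernel weights c_k = e^{−t} t^k / k!, scan the series until the tail drops below ε/2. The helper name `choose_N` is mine:

```python
import math

def choose_N(t, eps):
    """Smallest N with sum_{k > N} e^{-t} t^k / k! <= eps / 2,
    i.e. the truncation length the theorem's decay condition
    requires for heat-kernel coefficients."""
    c = math.exp(-t)   # c_0
    head, N = c, 0
    while 1.0 - head > eps / 2:
        N += 1
        c *= t / N     # c_N = c_{N-1} * t / N
        head += c
    return N

print(choose_N(1.0, 0.1))   # 3
print(choose_N(5.0, 0.01))
```

For PPR with weights (1−α)α^k the tail is geometric, so the analogous N is about log(2/ε) / log(1/α).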
Proof sketch
1. Stop pushing after N terms: Σ_{k=N+1}^∞ c_k ≤ ε/2.
2. Push residual entries in the first N terms if r_k(j) ≥ d(j) ε/(2N).
3. Total work is the number of edge operations over all pushes (pushing node j costs d(j) work):
Σ_{k=0}^{N−1} Σ_{t=1}^{m_k} d(j_t) ≤ Σ_{k=0}^{N−1} Σ_{t=1}^{m_k} r_k(j_t) (2N)/ε
4. Each r_k sums to ≤ 1 (each push is added to f, which sums to 1): Σ_{t=1}^{m_k} r_k(j_t) ≤ 1.
Total: O(2N²/ε).
Solution Paths
What is the benefit of these "push" diffusions? A direct decomposition is a black box: feed in input, get output. In contrast, the iterative nature of "push" means running the algorithm is essentially "watching" the diffusion process occur.