Walking Randomly, Massively, and Efficiently


  1. Walking Randomly, Massively, and Efficiently. Jakub Łącki, Slobodan Mitrović, Krzysztof Onak, Piotr Sankowski

  2. Why Random Walks? • Web ratings [Page, Brin, Motwani, Winograd ’99] [Berkhin ’05] [Chierichetti, Haddadan ’17] • Graph partitioning [Andersen, Chung, Lang ’06] • Random spanning trees [Kelner, Mądry ’09] • Laplacian solvers [Andoni, Krauthgamer, Pogrow ’18] • Connectivity [Reif ’85] [Halperin, Zwick ’94] • Matching [Goel, Kapralov, Khanna ’13] • Property testing [Goldreich, Ron ’99] [Kaufman, Krivelevich, Ron ’04] [Czumaj, Sohler ’10] [Nachmias, Shapira ’10] [Kale, Seshadhri ’11] [Czumaj, Peng, Sohler ’15] [Chiplunkar, Kapralov, Khanna, Mousavifar, Peres ’18] [Kumar, Seshadhri, Stolman ’18] [Czumaj, Monemizadeh, Onak, Sohler ’19]

  3. How to Compute Random Walks? • Centralized [direct implementation] • Streaming [Sarma, Gollapudi, Panigrahy ’11] [Jin ’19] • Distributed (CONGEST) [Sarma, Nanongkai, Pandurangan, Tetali ’13] • MPC, undirected graphs (non-independent walks) [Bahmani, Chakrabarti, Xin ’11]

  4. How to Compute Random Walks? • Centralized [direct implementation] • Streaming [Sarma, Gollapudi, Panigrahy ’11] [Jin ’19] • Distributed (CONGEST) [Sarma, Nanongkai, Pandurangan, Tetali ’13] • MPC, undirected graphs (non-independent walks) [Bahmani, Chakrabarti, Xin ’11] Our result (undirected graphs): independent random walks in MPC with sublinear memory per machine.

  5. Our Results. Input: undirected graph G; length L. Output: an L-length random walk per vertex; walks mutually independent. Rounds: O(log L). Space per machine: sublinear in n. Total space: O(mL log n).

  6. Our Results. Input: undirected graph G; length L. Output: an L-length random walk per vertex; walks mutually independent. Rounds: O(log L). Space per machine: sublinear in n. Total space: O(mL log n). Applications: PageRank for directed graphs; approximate connectivity and MST; approximate expansion testing; approximate bipartiteness testing.

  8. Our Results. Input: undirected graph G; length L. Output: an L-length random walk per vertex; walks mutually independent. Rounds: O(log L), with a conditional lower bound of Ω(log L) rounds. Space per machine: sublinear in n. Total space: O(mL log n). Applications: PageRank for directed graphs; approximate connectivity and MST; approximate expansion testing; approximate bipartiteness testing.

  9. Random Walks in Undirected Graphs

  10. Random Walks: Doubling by Stitching. Output: deg(v) L-length random walks per vertex v; walks mutually independent. Track spare random walks; use spares to double the wanted ones.

  11. Random Walks: Doubling by Stitching. Output: deg(v) L-length random walks per vertex v; walks mutually independent. Track spare random walks; use spares to double the wanted ones. (Figure: a length-2^i walk from v ends at w.)

  12. Random Walks: Doubling by Stitching. Output: deg(v) L-length random walks per vertex v; walks mutually independent. Track spare random walks; use spares to double the wanted ones. (Figure: a spare length-2^i walk starting at w ends at x.)

  13. Random Walks: Doubling by Stitching. Output: deg(v) L-length random walks per vertex v; walks mutually independent. Track spare random walks; use spares to double the wanted ones. (Figure: stitching the two walks gives a length-2^{i+1} walk from v to x.)

  14. Random Walks: Doubling by Stitching. Output: deg(v) L-length random walks per vertex v; walks mutually independent. Track spare random walks; use spares to double the wanted ones. (Figure: stitching the two walks gives a length-2^{i+1} walk from v to x.) But how will w know a priori how many walks will pass through it?
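To make the stitching step concrete, here is a minimal sequential sketch of one doubling phase. The helper name `double_walks` is hypothetical; the MPC algorithm instead routes each walk to the machine holding its endpoint, and the sketch simply assumes every endpoint has enough spare walks, which is exactly the issue the next slides address.

```python
from collections import defaultdict

def double_walks(wanted, spares):
    """One stitching phase: walks of length 2^i become walks of length 2^(i+1).

    wanted[v]: walks (vertex lists) of length 2^i starting at v that we extend.
    spares[v]: spare walks of length 2^i starting at v.
    Each wanted walk consumes one spare walk that starts at its endpoint.
    Simplified sketch only: it assumes every endpoint has enough spares.
    """
    pools = {v: list(ws) for v, ws in spares.items()}   # copy so we can pop safely
    doubled = defaultdict(list)
    for v, walks in wanted.items():
        for walk in walks:
            end = walk[-1]
            spare = pools[end].pop()             # take an unused spare at the endpoint
            doubled[v].append(walk + spare[1:])  # drop the duplicated endpoint vertex
    return dict(doubled)
```

Repeating such a phase about log L times turns initial single-edge walks into walks of length L, which is where the O(log L) round bound on slide 5 comes from.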

  15. Random Walks: Follow Stationary Distribution. But how will w know a priori how many walks will pass through it?

  16. Random Walks: Follow Stationary Distribution. But how will w know a priori how many walks will pass through it? Each vertex v maintains a number of random walks proportional to deg(v).

  17. Random Walks: Follow Stationary Distribution. But how will w know a priori how many walks will pass through it? Each vertex v maintains a number of random walks proportional to deg(v). In expectation, after t steps the number of walks ending at v is proportional to deg(v).
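Why keeping ~deg(v) walks per vertex is self-sustaining (a one-line check, my gloss of the slide): if every vertex u holds c·deg(u) walks in expectation, then after one random-walk step the expected number of walks at v is

\sum_{u : \{u,v\} \in E} c \cdot \deg(u) \cdot \frac{1}{\deg(u)} \;=\; c \cdot \deg(v),

so the per-vertex counts stay proportional to deg(v), i.e. to the stationary distribution deg(v)/(2m).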

  18. Random Walks: Takeaway. 1. Following the stationary distribution allows us to “predict” the future.

  19. Random Walks: Takeaway. 1. Following the stationary distribution allows us to “predict” the future. 2. The memory requirement is inversely proportional to the minimum entry of the stationary distribution, which for undirected graphs is at least 1/(2m).
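A back-of-the-envelope reading of takeaway 2 (my gloss, not on the slide): if walks are spread according to the stationary distribution π, then expecting at least one walk at every vertex takes on the order of 1/min_v π(v) walks in total, and for undirected graphs

\min_v \pi(v) \;=\; \min_v \frac{\deg(v)}{2m} \;\ge\; \frac{1}{2m},

so on the order of m walks suffice in expectation, consistent with the O(mL log n) total space on slide 5.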

  20. PageRank for Directed Graphs. Input: directed graph G_D. Output: (1+α)-approximate PageRank; ε is the jumping probability. Rounds: Õ(ε^{-1} log log n). Space per machine: sublinear in n. Total space: Õ((m + n^{1+o(1)}) ε^{-4} α^{-2}).

  21. (Prelude) Random Walks: Undirected vs Directed. Undirected graphs vs directed graphs.

  22. (Prelude) Random Walks: Undirected vs Directed. Undirected graphs: the stationary distribution is easy to compute, deg(v)/(2m); the stationary probability of v is “nicely” lower-bounded.

  23. (Prelude) Random Walks: Undirected vs Directed. Undirected graphs: the stationary distribution is easy to compute, deg(v)/(2m); the stationary probability of v is “nicely” lower-bounded. Directed graphs: the stationary distribution can be difficult to compute.

  24. (Prelude) Random Walks: Undirected vs Directed. Undirected graphs: the stationary distribution is easy to compute, deg(v)/(2m); the stationary probability of v is “nicely” lower-bounded. Directed graphs: the stationary distribution can be difficult to compute; the stationary probability of v can be O(1/2^n).
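A tiny self-contained illustration (not from the slides) of how badly this can fail for directed graphs: in the chain below, every vertex either advances or resets to the start, so the last vertex is reached only about once every 2^n excursions and its stationary probability is roughly 2^{-n}. The `stationary` helper is a hypothetical power-iteration routine for this toy example.

```python
import numpy as np

def stationary(P, iters=10_000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration (plenty for this tiny, fast-mixing toy chain)."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        pi = pi @ P
    return pi

n = 20
P = np.zeros((n, n))
for i in range(n - 1):
    P[i, i + 1] = 0.5      # advance along the chain...
    P[i, 0] = 0.5          # ...or reset to the start
P[n - 1, 0] = 1.0          # the last vertex always resets

print(stationary(P)[-1])   # roughly 2**-n, i.e. exponentially small
```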

  25. PageRank: Undirected vs Directed Graphs. Input: P = GD^{-1}; T = (1 − ε)P + (ε/n)·11^T. Output: stationary distribution of T.

  26. PageRank: Undirected vs Directed Graphs. Input: P = GD^{-1} (the walk matrix of G); T = (1 − ε)P + (ε/n)·11^T (follow P with probability 1 − ε, jump to a uniformly random vertex with probability ε). Output: stationary distribution of T.
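Instantiating the formula on this slide as a minimal numpy sketch (my illustration; dangling vertices with zero degree are not handled here):

```python
import numpy as np

def google_matrix(A, eps):
    """T = (1 - eps) * P + (eps / n) * 11^T, where P = A D^{-1} is the
    column-stochastic walk matrix (column j of A divided by the degree of j)."""
    n = A.shape[0]
    P = A / A.sum(axis=0)                        # P = A D^{-1}
    return (1 - eps) * P + (eps / n) * np.ones((n, n))
```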

  27. PageRank: Undirected vs Directed Graphs. Input: P = GD^{-1}; T = (1 − ε)P + (ε/n)·11^T. Output: stationary distribution of T. PageRank can be approximated from random walks of T. [Breyer ’02]
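One standard way to read this statement (a sketch of the generic Monte Carlo estimator, not necessarily the talk’s exact procedure): the endpoint of a walk that starts at a uniformly random vertex and stops with probability ε at every step is distributed exactly according to PageRank, so endpoint frequencies estimate it.

```python
import random

def pagerank_by_walks(out_adj, eps, num_walks=200_000):
    """Estimate PageRank from random walks of T: start uniformly at random,
    at each step stop with probability eps, otherwise follow a uniformly
    random out-edge.  out_adj maps each vertex to a non-empty list of
    out-neighbors (dangling vertices are not handled in this sketch)."""
    vertices = list(out_adj)
    counts = {v: 0 for v in vertices}
    for _ in range(num_walks):
        v = random.choice(vertices)
        while random.random() > eps:          # continue with probability 1 - eps
            v = random.choice(out_adj[v])
        counts[v] += 1
    return {v: c / num_walks for v, c in counts.items()}
```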

  28. PageRank: Undirected vs Directed Graphs. Input: P = GD^{-1}; T = (1 − ε)P + (ε/n)·11^T. Output: stationary distribution of T. PageRank can be approximated from random walks of T. [Breyer ’02] Undirected graphs: T and P are “similar”.

  29. PageRank: Undirected vs Directed Graphs. Input: P = GD^{-1}; T = (1 − ε)P + (ε/n)·11^T. Output: stationary distribution of T. PageRank can be approximated from random walks of T. [Breyer ’02] Undirected graphs: T and P are “similar”. Directed graphs: we do not know the stationary distribution of T.

  30. PageRank: Undirected vs Directed Graphs. Input: P = GD^{-1}; T = (1 − ε)P + (ε/n)·11^T. Output: stationary distribution of T. PageRank can be approximated from random walks of T. [Breyer ’02] Undirected graphs: T and P are “similar”; the stationary probability of v w.r.t. T is at least ε/n. Directed graphs: we do not know the stationary distribution of T; the stationary probability of v w.r.t. P can be O(1/2^n).

  31. PageRank: Molding Undirected to Directed. PageRank for undirected G → PageRank for directed G_D.

  32. PageRank: Molding Undirected to Directed. “Small” changes in T require only a “small” increase in the number of spare walks. PageRank for undirected G → PageRank for directed G_D.

  33. PageRank: Molding Undirected to Directed. “Small” changes in T require only a “small” increase in the number of spare walks. PageRank for undirected G → random walks for (1 − δ)G + δG_D → PageRank for directed G_D.

  34. PageRank: Molding Undirected to Directed. “Small” changes in T require only a “small” increase in the number of spare walks. PageRank can be approximated from random walks of T. [Breyer ’02] PageRank for undirected G → random walks for (1 − δ)G + δG_D → PageRank for (1 − δ)G + δG_D → PageRank for directed G_D.

  35. PageRank: Molding Undirected to Directed. “Small” changes in T require only a “small” increase in the number of spare walks. PageRank can be approximated from random walks of T. [Breyer ’02] PageRank for undirected G → random walks for (1 − δ)G + δG_D → PageRank for (1 − δ)G + δG_D → PageRank for (1 − 2δ)G + 2δG_D → PageRank for directed G_D.

  36. PageRank: Molding Undirected to Directed. “Small” changes in T require only a “small” increase in the number of spare walks. PageRank can be approximated from random walks of T. [Breyer ’02] PageRank for undirected G → random walks for (1 − δ)G + δG_D → PageRank for (1 − δ)G + δG_D → PageRank for (1 − 2δ)G + 2δG_D → … → PageRank for δG + (1 − δ)G_D → PageRank for directed G_D.
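One plausible way to read the blended graph (1 − δ)G + δG_D (my interpretation of the interpolation; the paper’s construction may differ in details): mix the edge weights of the undirected and the directed graph and walk on the result, so that a small δ keeps the walk matrix close to the undirected one and only a few extra spare walks are needed per step of the molding sequence.

```python
import numpy as np

def blended_walk_matrix(A_und, A_dir, delta):
    """Column-stochastic walk matrix of (1 - delta) * G + delta * G_D:
    blend the two (weighted) adjacency matrices and normalize columns,
    mirroring P = G D^{-1} from slide 25.  Illustrative sketch only;
    assumes every vertex has positive blended out-degree."""
    W = (1 - delta) * A_und + delta * A_dir
    return W / W.sum(axis=0)     # divide each column by its weighted degree
```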
