


  1. Scalable Diffusion-Aware Optimization of Network Topology Elias Boutros Khalil, Bistra Dilkina, Le Song Georgia Institute of Technology

  2. Problem
     • Given
       • G(V,E)
       • a set of source nodes X (infected nodes)
       • the Linear Threshold Model
     • Find a set of k edges to
       • remove, such that the spread of a certain substance is minimized
       • add, such that the spread of a certain substance is maximized

  3. Review: Diffusion Models
     • Linear Threshold Model
       • Each edge has a weight W_uv
       • Each node v chooses a threshold θ_v uniformly at random in [0,1]
       • Node v will be infected if the total weight W_uv over its infected in-neighbors u reaches θ_v
     • Independent Cascade Model
       • Each edge has a propagation probability P_uv
       • Each infected node u has only one chance to infect its neighbor v, with probability P_uv
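
The two models reviewed on this slide can each be simulated in a few lines. The following is a minimal Python sketch, not code from the talk: the dict-of-dicts graph layout and the names `simulate_ic` and `simulate_lt` are assumptions made for illustration, and the LT version assumes that the incoming weights of every node sum to at most 1.

```python
import random


def simulate_ic(graph, seeds, rng):
    """One run of the Independent Cascade model.

    graph: dict u -> {v: p_uv} of out-edges (hypothetical layout).
    Returns the set of nodes infected when the cascade stops.
    """
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p_uv in graph.get(u, {}).items():
                # u gets exactly one chance to infect each healthy neighbor v.
                if v not in infected and rng.random() < p_uv:
                    infected.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return infected


def simulate_lt(graph, seeds, rng):
    """One run of the Linear Threshold model.

    graph: dict u -> {v: w_uv} of out-edges (hypothetical layout).
    """
    nodes = set(graph)
    for nbrs in graph.values():
        nodes.update(nbrs)
    # Every node draws its threshold uniformly at random.
    threshold = {v: rng.random() for v in sorted(nodes)}
    incoming = {v: 0.0 for v in nodes}  # weight received from infected in-neighbors
    infected = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, w_uv in graph.get(u, {}).items():
                if v in infected:
                    continue
                incoming[v] += w_uv
                if incoming[v] >= threshold[v]:
                    infected.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return infected
```

Averaging the size of the returned set over many runs gives a Monte Carlo estimate of the expected spread from the chosen sources.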

  4. Review: Influence Maximization
     • Given
       • G(V,E)
       • LT model or IC model
     • Find k nodes to activate so as to maximize the spread of a certain substance
     • Greedy algorithm
       • Objective function is submodular
       • (1-1/e)-approximation
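
The greedy algorithm named on this slide can be sketched as follows. This is an illustration under stated assumptions rather than the presenters' code: the spread is estimated by Monte Carlo simulation of the IC model, the graph is a dict of out-edge probabilities, and `ic_spread`, `expected_spread` and `greedy_seeds` are hypothetical names. The (1-1/e) guarantee refers to the exact expected spread; sampling only approximates it.

```python
import random


def ic_spread(graph, seeds, rng):
    """Number of nodes infected in one Independent Cascade run from `seeds`."""
    infected, frontier = set(seeds), list(seeds)
    while frontier:
        u = frontier.pop()
        for v, p in graph.get(u, {}).items():
            if v not in infected and rng.random() < p:
                infected.add(v)
                frontier.append(v)
    return len(infected)


def expected_spread(graph, seeds, runs, seed=0):
    """Monte Carlo estimate of the expected spread (same random stream per call)."""
    rng = random.Random(seed)
    return sum(ic_spread(graph, seeds, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, candidates, k, runs=200):
    """Pick k seeds, each time adding the node with the largest marginal gain."""
    chosen = []
    for _ in range(k):
        base = expected_spread(graph, chosen, runs)
        best, best_gain = None, float("-inf")
        for v in candidates:
            if v in chosen:
                continue
            gain = expected_spread(graph, chosen + [v], runs) - base
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # fewer candidates than k
            break
        chosen.append(best)
    return chosen
```

Each round re-estimates the spread for every remaining candidate, which is why practical implementations add lazy evaluation or other scaling-up tricks.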

  5. Edge Deletion Problem
     • Given G, source set A
     • Find k edges to remove
     • Objective is supermodular
     • Greedy algorithm provides a (1-1/e)-approximation
     • Scaling-up tricks

  6. Edge Addition Problem
     • Given G, source set A
     • Find k edges to add
     • Still supermodular (equivalent to constrained submodular minimization)
     • Algorithm: maximize the lower bound

  7. Edge Addition Problem
     • Marginal gain is bounded
     • Apply an approach for constrained submodular minimization with approximation guarantees
       R. Iyer, S. Jegelka, and J. Bilmes. Fast semidifferential-based submodular function optimization. In ICML, 2013.

  8. Experiments
     • Datasets
       • Synthetic datasets: generated by the Kronecker graph model
         • (1) CorePeriphery, (2) ErdosRenyi and (3) Hierarchical
       • Real datasets:

  9. Experiments
     • Competing heuristics
       • Random
       • Weights: the k edges with the highest weights
       • Betweenness
       • Eigen: the k edges that maximize the leading eigendrop
       • Degree: the k edges whose destination nodes have the highest out-degrees [8]

  10. Experiments
      • [Figure panels: Edge deletion; Edge addition]

  11. Core Decomposition of Uncertain Graphs Francesco Bonchi, Francesco Gullo, Andreas Kaltenbrunner, Yana Volkovich Yahoo Labs, Spain

  12. Core decomposition
      • k-core of a graph
        • a maximal subgraph in which every vertex is connected to at least k other vertices within that subgraph
      • Core decomposition
        • The set of all k-cores of a graph G forms the core decomposition of G
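
The definition above leads to the classic peeling procedure: repeatedly remove a vertex of minimum remaining degree. The sketch below is a simple quadratic-time Python version written for illustration; `core_numbers` and the adjacency-dict layout are assumptions, and a bucket-based variant would be needed for linear running time.

```python
def core_numbers(adj):
    """Core index of every vertex of an undirected graph.

    adj: dict node -> set of neighbours (symmetric; hypothetical layout).
    A vertex with core index c belongs to every k-core with k <= c.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel off a vertex of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core indices never decrease during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core
```

For a triangle with one pendant vertex attached, the triangle vertices get core index 2 and the pendant vertex gets 1.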

  13. k-core in uncertain graphs
      • A maximal subgraph in which every vertex has at least k neighbours in that subgraph with probability no less than η

  14. Example

  15. Motivation
      • Core decomposition can be computed efficiently in deterministic graphs
        • computed in linear time
      • However, this efficiency is not guaranteed in uncertain graphs
        • even the simplest graph operations may become computationally intensive
      • Uncertain graph
        • edges are assigned a probability of existence
        • e.g., protein-interaction networks, the influence of one person on another

  16. Applications
      • Influence maximization
        • Idea: reduce the input graph G by keeping only the innermost η-shells
        • the higher the core index, the more likely the vertex is an influential spreader [17]
      • Task-driven team formation
        • Nodes: individuals; edges: a probabilistic topic model
        • Given a pair <T,Q>, where T is a set of terms and Q is a set of nodes
        • Goal: find a set of nodes A with Q ⊆ A that is a good team to perform the task in T
        • Solution: find a connected component of the (k, η)-core which contains Q

  17. Algorithm framework
      • Follows the deterministic case
      • Key quantity for each vertex v: the maximum degree k such that the probability for v to have degree at least k is no less than η
      • This quantity is non-trivial to compute
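
The per-vertex quantity described on this slide can be computed with a small dynamic program over the incident edge probabilities. The sketch below is an illustration, not the authors' algorithm: it assumes edges exist independently of each other, and `degree_distribution` and `eta_degree` are hypothetical names.

```python
def degree_distribution(edge_probs):
    """dist[i] = probability that exactly i of the incident edges exist.

    Assumes the edges exist independently (an assumption of this sketch).
    """
    dist = [1.0]
    for p in edge_probs:
        new = [0.0] * (len(dist) + 1)
        for i, q in enumerate(dist):
            new[i] += q * (1.0 - p)  # this edge is absent
            new[i + 1] += q * p      # this edge is present
        dist = new
    return dist


def eta_degree(edge_probs, eta):
    """Largest k such that Pr[degree >= k] >= eta."""
    dist = degree_distribution(edge_probs)
    tail = 0.0
    for k in range(len(dist) - 1, -1, -1):
        tail += dist[k]  # tail is now Pr[degree >= k]
        if tail >= eta:
            return k
    return 0  # only reachable through rounding error when eta is close to 1
```

With two incident edges of probability 0.5 each, the degree is at least 2 with probability 0.25 and at least 1 with probability 0.75, so the value is 2 for η = 0.2 and 1 for η = 0.5. The dynamic program costs time quadratic in the number of incident edges.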

  18. Experiments
      • [Figure panels: Influence Maximization; Task-driven Team-formation]

  19. Fast Influence-based Coarsening for Large Networks Manish Purohit^, B. Aditya Prakash*, Chanhyun Kang^, Yao Zhang*, V S Subrahmanian^ (*Virginia Tech, ^University of Maryland) KDD, New York City, August 26, 2014

  20. Networks are getting huge!
      • Flickr (friendship network): 87 million users and 8 billion photos until 2013
      • Amazon (friendship network): 237 million accounts until 2013
      • Facebook (friendship network): 829 million daily active users on average in June 2014
      • Twitter (follower network): 271 million monthly active users
      Purohit, Prakash, Kang, Zhang, Subrahmanian 2014

  21. Need for fast analysis
      • Ever-growing list of applications of network effects
        • Viral Marketing
        • Immunization
        • Information Diffusion
        • …
      • However, scaling traditional algorithms up to millions of nodes is hard

  22. How to handle large-scale networks
      • Approaches
        • Use faster / simpler algorithms
        • Perform analysis locally
          • i.e., divide the large network into smaller subgraphs
        • Zoom out of the network to obtain a smaller representation of the network (this paper)

  23. Bird’s eye view of a network

  24. Bird’s eye view of a network
      • “Zoom out” of the graph to get a quick picture
      • [Figure: a big graph on nodes A to F is zoomed out into a small representation of the network]
      • Called “coarsen” in this paper

  25. Outline
      • Motivation
      • Challenges
      • Problem Definition
      • Our Proposed Method
      • Experiments
      • Applications
      • Conclusion

  26. Challenges
      • C1: How do we maintain diffusive characteristics when coarsening networks?
      • C2: How do we merge nodes to get the coarse network?
      • C3: How do we find the best nodes to merge fast?

  27. C1: Information Diffusion
      • Cascading behavior in networks
      • [Figure: a blog network (blogs, posts, links) and an information cascade. Source: McGlohon et al., SDM 2007]
      • A diffusion is the graph induced by a time-ordered propagation of information (edges)

  28. C1: Model information diffusion
      • Information spreads over networks
        • e.g., a rumor or meme spreads over the Twitter follower network
      • Independent cascade model (IC) [Kempe+, KDD03]
        • Weights p_ij: propagation probability from i to j
        • Each node has only one chance to infect its neighbors
      • [Figure: meme spreading]

  29. C1: Diffusive characteristics
      • First eigenvalue λ1 (of the adjacency matrix) is enough for most diffusion models (Prakash et al. [ICDM’12])
      • λ1 determines the epidemic threshold
      • [Figure: “Safe”, “Vulnerable”, “Deadly”; increasing λ1, increasing vulnerability]
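
The first eigenvalue λ1 referred to on this slide can be estimated with power iteration. The sketch below is a pure-Python illustration under assumptions: the graph is a dict of non-negative edge weights, `leading_eigenvalue` is a hypothetical name, and the iteration runs on A + I (then subtracts 1) so that it also converges on bipartite graphs.

```python
def leading_eigenvalue(adj, iters=200):
    """Estimate the largest eigenvalue of a non-negative weighted adjacency matrix.

    adj: dict u -> {v: weight}, meaning A[u][v] = weight (hypothetical layout).
    """
    nodes = set(adj) | {v for nbrs in adj.values() for v in nbrs}
    if not nodes:
        return 0.0
    x = {v: 1.0 for v in nodes}
    top = 1.0
    for _ in range(iters):
        y = dict(x)  # the identity part of (A + I) x
        for u, nbrs in adj.items():
            for v, w in nbrs.items():
                y[u] += w * x[v]
        top = max(y.values())  # x has maximum entry 1, so this tends to lambda_1 + 1
        x = {v: val / top for v, val in y.items()}
    return top - 1.0
```

On a triangle with unit weights this returns 2, and on a star with four leaves it also returns approximately 2 (the square root of the number of leaves).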

  30. C1: Maintain diffusive characteristics
      • Goal: maintain the diffusive characteristics of the original network in the coarsened network
      • Make the coarsened network have the least change in the first eigenvalue
      • [Figure: the original network on nodes A to F is coarsened into the coarsened network]

  31. C2: How to merge nodes
      • Goal: merge nodes of graph G to get a coarsened graph that “approximates” G with respect to diffusion
      • Merging b and a gives the least change of λ1
      • Influence from d to b: 0.5; influence from d to a: 0.25; average: 0.375
      • Is 0.375 correct?
      • [Figure: original network]

  32. Details, C2: How to merge nodes
      • In general: merging a, b

  33. C3: Which nodes to merge
      • Goal:
        • Find the best nodes to merge
        • Fast, scalable to large networks (discussed later)
      • [Figure: the original network on nodes A to F is coarsened into the coarsened network]

  34. Outline
      • Motivation
      • Challenges
      • Problem Definition
      • Our Proposed Method
      • Experiments
      • Applications
      • Conclusion

  35. Problem Definition: Graph Coarsening Problem (GCP)
      • Given: a large graph G(V, E) and a reduction factor α
      • Find: the best set of edges to merge
      • Such that: |λ_G - λ_H| is minimized
        • (i.e., H is the coarsened graph with the least change in the first eigenvalue)

  36. Naive Greedy Heuristic
      • Steps:
        • Score every edge by the change in eigenvalue
        • Greedily choose the edge (a,b) with the least score, and merge (a,b)
        • Re-evaluate the scores of every edge and repeat
      • Too slow! O(m²) time to score all edges
      • Loses the time benefits of analyzing the smaller graph
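
The steps of the naive heuristic can be sketched in Python as below. This is an illustration under loud assumptions, not the method from the talk: `merge_pair` uses a simple averaging rule (a missing edge counts as weight 0, and the self-loop created by the merged edge is dropped) as a stand-in for the merge definition on the "Details" slide, which is not reproduced in this transcript, and all function names are hypothetical.

```python
def leading_eigenvalue(adj, iters=200):
    """Power iteration on A + I for a non-negative weighted graph (dict of dicts)."""
    nodes = set(adj) | {v for nbrs in adj.values() for v in nbrs}
    if not nodes:
        return 0.0
    x = {v: 1.0 for v in nodes}
    top = 1.0
    for _ in range(iters):
        y = dict(x)
        for u, nbrs in adj.items():
            for v, w in nbrs.items():
                y[u] += w * x[v]
        top = max(y.values())
        x = {v: val / top for v, val in y.items()}
    return top - 1.0


def merge_pair(adj, a, b):
    """Fold b into a and return a new graph (assumed averaging rule, see lead-in)."""
    nodes = set(adj) | {v for nbrs in adj.values() for v in nbrs}

    def w(u, v):
        return adj.get(u, {}).get(v, 0.0)

    merged = {}
    for u in nodes - {b}:
        for v in nodes - {b, u}:
            if u == a:
                weight = (w(a, v) + w(b, v)) / 2  # out-edges of the merged node
            elif v == a:
                weight = (w(u, a) + w(u, b)) / 2  # in-edges of the merged node
            else:
                weight = w(u, v)
            if weight > 0:
                merged.setdefault(u, {})[v] = weight
    return merged


def naive_greedy_coarsen(adj, merges):
    """Repeatedly merge the edge whose contraction changes lambda_1 the least."""
    for _ in range(merges):
        edges = [(u, v) for u, nbrs in adj.items() for v in nbrs]
        if not edges:
            break
        lam = leading_eigenvalue(adj)
        # Score every edge by recomputing the eigenvalue of the merged graph.
        a, b = min(edges, key=lambda e: abs(leading_eigenvalue(merge_pair(adj, *e)) - lam))
        adj = merge_pair(adj, a, b)
    return adj
```

Every round recomputes an eigenvalue for each candidate edge, which makes the cost of the naive approach visible even on small inputs.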

  37. Outline
      • Motivation
      • Problem Definition
      • Challenges
      • Our Proposed Method
        • CoarseNet
      • Experiments
      • Applications
      • Conclusion

  38. CoarseNet: idea
      • Can we approximate the edge scores faster?
      • Yes!
        • Use matrix perturbation arguments to estimate (up to first-order terms) the score of an edge in constant time!
        • Score all edges in O(m) time
        • Naive heuristic: O(m²) time
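
The matrix perturbation argument mentioned on this slide rests on a standard first-order expansion of a simple eigenvalue. The block below states that generic result only; the notation ΔA_ab for the change caused by merging a and b is an assumption of this note, and the exact closed-form score used by CoarseNet is not reproduced in this transcript.

```latex
% u and v are the left and right eigenvectors of A for \lambda_1.
\[
  \lambda_1(A + \Delta A) \;\approx\; \lambda_1(A)
    + \frac{u^{\top}\,\Delta A\,v}{u^{\top} v}
\]
% Hypothetical notation: \Delta A_{ab} is the change to A caused by merging a and b.
\[
  \mathrm{score}(a,b) \;\approx\;
    \left| \frac{u^{\top}\,\Delta A_{ab}\,v}{u^{\top} v} \right|
\]
```

Because a merge changes only the rows and columns of a and b, the numerator involves only entries of u and v around those two nodes, so u and v can be computed once and reused for every edge.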
