
Hierarchically Modular Structure in Complex Networks, Aaron Clauset

Hierarchically Modular Structure in Complex Networks. Aaron Clauset, Santa Fe Institute. 3 November 2008. DIMACS / DyDAn, “Network Models of Biological and Social Contagion”.


  1. Hierarchically Modular Structure in Complex Networks. Aaron Clauset, Santa Fe Institute. 3 November 2008. DIMACS / DyDAn, “Network Models of Biological and Social Contagion”

  2. Modular Hierarchies. [Figure: grassland species network, with herbivore, parasite, and plant species labeled.] (thank you: Jennifer Dunne)

  3. Modular Hierarchies

  4. Modular Hierarchies

  5. The Task: How can we extract this hierarchical (multi-scale) structure from complex networks? [Figure: network → hierarchy?]

  6. One Approach: model-based inference
     1. describe how to generate hierarchies (a model)
     2. “fit” model to empirical data
     3. test “fitted” model
     4. extract predictions + insight
     5. profit!

  7. A Model of Hierarchy

  8. A Model of Hierarchy: a dendrogram $D$ with probabilities $\{p_r\}$; assortative modules → probability $p_r$

  9. Model → “inhomogeneous” random graph → instance. $\Pr(i, j \text{ connected}) = p_r = p_{(\text{lowest common ancestor of } i, j)}$
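The generative rule on slide 9 can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the nested-tuple tree layout, the function names `leaves` and `sample_graph`, and the toy probabilities are all assumptions made for the example.

```python
import random
from itertools import product

# Hypothetical representation (an assumption of this sketch): a leaf is an
# int vertex label; an internal node r is a tuple (p_r, left, right).

def leaves(tree):
    """Return the vertex labels below a node."""
    if isinstance(tree, int):
        return [tree]
    _, left, right = tree
    return leaves(left) + leaves(right)

def sample_graph(tree, rng):
    """Draw one graph instance as a set of edges (i, j) with i < j.

    Each pair whose lowest common ancestor is node r is connected
    independently with probability p_r.
    """
    if isinstance(tree, int):
        return set()
    p, left, right = tree
    edges = sample_graph(left, rng) | sample_graph(right, rng)
    # Pairs split across the two subtrees have this node as their
    # lowest common ancestor.
    for i, j in product(leaves(left), leaves(right)):
        if rng.random() < p:
            edges.add((min(i, j), max(i, j)))
    return edges

# Toy example: two assortative modules {0, 1, 2} and {3, 4, 5},
# dense inside each module and sparse between them.
toy = (0.05, (0.9, 0, (0.9, 1, 2)), (0.9, 3, (0.9, 4, 5)))
graph = sample_graph(toy, random.Random(0))
```

Setting a low probability near the root and high probabilities deeper in the tree gives assortative modules; reversing that pattern would give disassortative structure.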

  10. Model Features
      • explicit model = explicit assumptions
      • very flexible (many parameters)
      • captures structure at all scales
      • arbitrary mixtures of assortativity, disassortativity
      • learnable directly from data

  11. Learning From Data: a direct approach
      • likelihood function $L = \Pr(\text{data} \mid \text{model})$ (scores quality of model)
      • sample the good models via Markov chain Monte Carlo
      • technical details in arXiv:physics/0610051
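As a rough illustration of the likelihood on slide 11: because pairs are connected independently, each internal node contributes one term per pair it separates. The sketch below assumes a hypothetical nested-tuple tree `(p_r, left, right)` with int leaves and treats the $p_r$ as given. The accept rule is the standard Metropolis rule for a symmetric proposal. The moves that actually alter the dendrogram are described in the cited arXiv paper and are not reproduced here.

```python
import math
import random

def leaves(tree):
    """Vertex labels below a node; a leaf is an int (assumed layout)."""
    if isinstance(tree, int):
        return [tree]
    _, left, right = tree
    return leaves(left) + leaves(right)

def _count_log(count, prob):
    """count * log(prob), with 0 * log(0) treated as 0."""
    if count == 0:
        return 0.0
    return count * math.log(prob) if prob > 0 else float("-inf")

def log_likelihood(tree, edges):
    """log Pr(edges | tree): a sum over internal nodes of
    E_r * log(p_r) + (pairs_r - E_r) * log(1 - p_r)."""
    if isinstance(tree, int):
        return 0.0
    p, left, right = tree
    left_set, right_set = leaves(left), leaves(right)
    pairs = len(left_set) * len(right_set)
    present = sum(
        1 for i in left_set for j in right_set
        if (min(i, j), max(i, j)) in edges
    )
    here = _count_log(present, p) + _count_log(pairs - present, 1.0 - p)
    return here + log_likelihood(left, edges) + log_likelihood(right, edges)

def metropolis_accept(log_l_current, log_l_proposed, rng):
    """Accept a proposed model with probability min(1, L' / L)."""
    delta = log_l_proposed - log_l_current
    return delta >= 0 or rng.random() < math.exp(delta)

edges = {(0, 1), (2, 3)}
good = (0.1, (0.9, 0, 1), (0.9, 2, 3))   # groups the connected pairs
bad = (0.1, (0.9, 0, 2), (0.9, 1, 3))    # groups unconnected pairs
print(log_likelihood(good, edges) > log_likelihood(bad, edges))
```

A tree that groups connected vertices together scores higher than one that does not, which is what lets the sampler concentrate on "the good models".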

  12. From Graph to Ensemble

  13. From Graph to Ensemble
      • Given graph $G$
      • run MCMC to equilibrium
      • then, for each sampled $D$, draw a resampled graph from the ensemble
      A test: do resampled graphs look like the original?
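One way to run the test on slide 13 is to compute the same statistic on the original graph and on each resampled graph, then compare. The sketch below does this for the degree distribution. The edge-set representation and the function names are assumptions of the example, and the resampled graphs are taken as given.

```python
from collections import Counter

def degree_distribution(edges, n):
    """Fraction of the n vertices (labeled 0..n-1) having each degree k."""
    degree = Counter()
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    counts = Counter(degree[v] for v in range(n))
    return {k: c / n for k, c in sorted(counts.items())}

def mean_distribution(graphs, n):
    """Average the degree distribution over a list of resampled graphs."""
    total = Counter()
    for edges in graphs:
        for k, frac in degree_distribution(edges, n).items():
            total[k] += frac
    return {k: total[k] / len(graphs) for k in sorted(total)}

original = {(0, 1), (1, 2)}
resampled = [{(0, 1)}, {(0, 1), (1, 2), (2, 3)}]   # placeholder graphs
print(degree_distribution(original, 4))
print(mean_distribution(resampled, 4))
```

The same pattern applies to the clustering coefficient and the distance distribution: swap in a different statistic and keep the averaging step.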

  14. [Figure: grassland species network, with herbivore, plant, and parasite species labeled.] (thank you: Jennifer Dunne)

  15. Degree Distribution. [Figure: fraction of vertices with degree k versus degree k, on logarithmic axes; curves: original, resampled.]

  16. Clustering Coefficient. [Figure: fraction of graphs with clustering coefficient c versus clustering coefficient c; labels: original, resampled.]

  17. Distance Distribution. [Figure: fraction of vertex pairs at distance d versus distance d, on a logarithmic vertical axis; curves: original, resampled.]

  18. Missing Links A test: can model predict missing links?

  19. Predicting is Hard
      • remove $k$ edges from $G$
      • how easy to guess a missing link?
        $p_{\text{guess}} \approx \dfrac{k}{\binom{n}{2} - m + k} = O(n^{-2})$
      • for $n = 75$, $m = 113$: $p_{\text{guess}} = k / (2662 + k)$
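The arithmetic on slide 19 is easy to check: with $n = 75$ vertices there are $\binom{75}{2} = 2775$ possible pairs, and subtracting the $m = 113$ edges leaves 2662 unconnected pairs. The short sketch below reproduces this; the function name `p_guess` is chosen for the example.

```python
from math import comb

def p_guess(n, m, k):
    """Chance that one uniformly random candidate pair is a removed edge.

    After removing k of the m edges, the candidates are the
    comb(n, 2) - m pairs that were never connected plus the k removed ones.
    """
    return k / (comb(n, 2) - m + k)

n, m = 75, 113
print(comb(n, 2) - m)       # 2662
print(p_guess(n, m, 10))    # equals 10 / (2662 + 10)
```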

  20. Predicting Missing Links
      • Given incomplete graph $G$
      • run MCMC to equilibrium
      • then, over sampled $D$, compute the average $\langle p_r \rangle$ for pairs $(i, j) \notin G$
      • predict that pairs with high $\langle p_r \rangle$ values are missing links
      Test idea via leave-$k$-out cross-validation
      • perfect accuracy: AUC = 1
      • no better than chance: AUC = 1/2
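The scoring and evaluation steps on slide 20 can be sketched as follows. The sketch assumes the MCMC samples are already available as hypothetical nested-tuple trees `(p_r, left, right)` with int leaves. It computes AUC as the probability that a truly missing link outscores a true non-edge, counting ties as one half, which is a standard reading of the area under the ROC curve.

```python
def leaves(tree):
    """Vertex labels below a node; a leaf is an int (assumed layout)."""
    if isinstance(tree, int):
        return [tree]
    _, left, right = tree
    return leaves(left) + leaves(right)

def lca_probability(tree, i, j):
    """p_r at the lowest common ancestor of distinct leaves i and j."""
    p, left, right = tree
    for child in (left, right):
        if not isinstance(child, int):
            members = leaves(child)
            if i in members and j in members:
                return lca_probability(child, i, j)
    return p

def mean_probability(trees, i, j):
    """Average p_r for the pair (i, j) over the sampled trees."""
    return sum(lca_probability(t, i, j) for t in trees) / len(trees)

def auc(missing_scores, nonedge_scores):
    """Probability that a missing link scores above a non-edge."""
    wins = 0.0
    for s in missing_scores:
        for t in nonedge_scores:
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
    return wins / (len(missing_scores) * len(nonedge_scores))

samples = [
    (0.05, (0.9, 0, (0.8, 1, 2)), (0.7, 3, 4)),
    (0.15, (0.7, 0, (0.6, 1, 2)), (0.5, 3, 4)),
]
held_out = [mean_probability(samples, 1, 2)]    # pretend (1, 2) was removed
non_edges = [mean_probability(samples, 0, 3)]   # pretend (0, 3) never existed
print(auc(held_out, non_edges))
```

In a leave-k-out run, `held_out` would hold the scores of the k removed edges and `non_edges` the scores of the pairs that were never connected.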

  21. Missing Structure. Grassland species network. [Figure: area under ROC curve (AUC) versus fraction of edges observed, k/m; curves: pure chance, common neighbors, Jaccard coefficient, degree product, shortest paths, hierarchical structure. Annotations: hierarchy, simple predictors, pure chance.]

  22. Other Networks. [Figure, two panels: (a) terrorist association network, (b) T. pallidum metabolic network. Each panel plots AUC versus fraction of edges observed; curves: pure chance, common neighbors, Jaccard coefficient, degree product, shortest paths, hierarchical structure.]

  23. Summary
      • Many real networks are hierarchically modular
      • Hierarchies can
        • model multi-scale structure
        • generalize a single network
        • predict missing links
      • Model-based inference is very powerful
      Acknowledgments: C. Moore, M.E.J. Newman, C.H. Wiggins, and C.R. Shalizi

  24. Fin
