
Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications

Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications Pin-Yu Chen IBM Research AI joint work with Lingfei Wu (IBM Research AI) Sijia Liu (IBM Research AI) Indika Rajapakse (Univ. Michigan Ann Arbor)


  1. Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications
     Pin-Yu Chen (IBM Research AI)
     Joint work with Lingfei Wu (IBM Research AI), Sijia Liu (IBM Research AI), and Indika Rajapakse (Univ. Michigan Ann Arbor)
     Poster: Tuesday 6:30-9:00 pm, Pacific Ballroom #265
     P.-Y. Chen, ICML 2019, June 10, 2019

  2. Graph as a Data Representation

  3. Information-Theoretic Measures between Graphs
     Structural reducibility of multilayer networks (unsupervised learning)
     De Domenico et al., "Structural reducibility of multilayer networks." Nature Communications 6 (2015).

  4. Von Neumann Graph Entropy (VNGE): Introduction
     Quantum information theory:
     - $\Phi$ is an $n \times n$ density matrix that is symmetric, positive semidefinite, and satisfies $\mathrm{trace}(\Phi) = 1$
     - $\{\lambda_i\}_{i=1}^n$: eigenvalues of $\Phi$
     - Von Neumann entropy: $H = -\mathrm{trace}(\Phi \ln \Phi) = -\sum_{i:\, \lambda_i > 0} \lambda_i \ln \lambda_i$
       → Shannon entropy over the eigenspectrum $\{\lambda_i\}_{i=1}^n$, since $\sum_i \lambda_i = 1$
       ⇒ Generally requires $O(n^3)$ computation complexity for $H$
     Graph $G = (\mathcal{V}, \mathcal{E}, W) \in \mathcal{G}$: undirected weighted graphs with nonnegative edge weights.
     - $G$ has $|\mathcal{V}| = n$ nodes and $|\mathcal{E}| = m$ edges.
     - $L = D - W$: combinatorial graph Laplacian matrix of $G$.
     - $D = \mathrm{diag}(\{d_i\})$: diagonal degree matrix. $[W]_{ij} = w_{ij}$: edge weight.
     Von Neumann graph entropy (VNGE): $\Phi = L_N = c \cdot L$, where
       $c = \frac{1}{\mathrm{trace}(L)} = \frac{1}{\sum_{i \in \mathcal{V}} d_i} = \frac{1}{2 \sum_{(i,j) \in \mathcal{E}} w_{ij}}$
     $H \le \ln(n-1)$, with equality when $G$ is a complete graph with identical edge weights.
     References:
     - Braunstein, Samuel L., Sibasish Ghosh, and Simone Severini. "The Laplacian of a graph as a density matrix: a basic combinatorial approach to separability of mixed states." Annals of Combinatorics 10.3 (2006): 291-317.
     - Passerini, Filippo, and Simone Severini. "The von Neumann entropy of networks." (2008).
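The definitions on this slide can be checked numerically. The following is a minimal sketch, assuming a dense symmetric weight matrix and NumPy; the function name `exact_vnge` and the eigenvalue cutoff `1e-12` are illustrative choices, not taken from the slides. It uses a full eigendecomposition, which is the $O(n^3)$ route the slide refers to.

```python
import numpy as np

def exact_vnge(W):
    """Exact von Neumann graph entropy of a symmetric, nonnegative weight matrix W."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                # node degrees (sums of incident edge weights)
    L = np.diag(d) - W               # combinatorial Laplacian L = D - W
    c = 1.0 / np.trace(L)            # c = 1 / trace(L)
    lam = np.linalg.eigvalsh(c * L)  # eigenvalues of the density matrix L_N = c * L
    lam = lam[lam > 1e-12]           # keep positive eigenvalues only (assumed cutoff)
    return float(-np.sum(lam * np.log(lam)))

# A complete graph with identical edge weights attains the bound ln(n - 1).
n = 6
W_complete = np.ones((n, n)) - np.eye(n)
H_complete = exact_vnge(W_complete)

# A path graph on the same node set stays below the bound.
W_path = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
H_path = exact_vnge(W_path)
```

For the complete graph the normalized Laplacian has $n-1$ eigenvalues equal to $1/(n-1)$, so the entropy equals $\ln(n-1)$ as stated on the slide.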

  5. Von Neumann Graph Entropy (VNGE): Introduction
     VNGE characterizes the structural complexity of a graph and enables computation of the Jensen-Shannon distance (JSdist) between graphs.
     Applications in network learning, computer vision, and data science:
     1. Structural reducibility of multilayer networks (hierarchical clustering)
        De Domenico et al., "Structural reducibility of multilayer networks." Nature Communications 6 (2015).
     2. Depth-analysis for image processing
        Han, Lin, et al. "Graph characterizations from von Neumann entropy." Pattern Recognition Letters 33.15 (2012): 1958-1967.
        Bai, Lu, and Edwin R. Hancock. "Depth-based complexity traces of graphs." Pattern Recognition 47.3 (2014): 1172-1186.
     3. Network-ensemble comparison via edge rewiring
        Li, Zichao, Peter J. Mucha, and Dane Taylor. "Network-ensemble comparisons with stochastic rewiring and von Neumann entropy." SIAM Journal on Applied Mathematics 78.2 (2018): 897-920.
     4. Structure-function analysis in genetic networks
        Liu et al., "Dynamic network analysis of the 4D nucleome." bioRxiv, pp. 268318 (2018).
     High consistency with classical Shannon graph entropy, which is defined over a probability distribution of a function on subgraphs of G.
        Anand, Kartik, Ginestra Bianconi, and Simone Severini. "Shannon and von Neumann entropy of random networks with heterogeneous expected degree." Physical Review E 83.3 (2011): 036109.
        Anand, Kartik, and Ginestra Bianconi. "Entropy measures for networks: Toward an information theory of complex topologies." Physical Review E 80.4 (2009): 045102.
        Li, Angsheng, and Yicheng Pan. "Structural Information and Dynamical Complexity of Networks." IEEE Transactions on Information Theory 62.6 (2016): 3290-3339.

  6. Outline
     The main challenge of exact VNGE computation: it generally requires cubic complexity $O(n^3)$ to obtain the full eigenspectrum → NOT scalable to large graphs
     Our solution: FINGER, a scalable and provably asymptotically correct approximate computation framework for VNGE
     FINGER supports two different data modes, batch and online:
     (a) Batch mode: $O(n + m)$
     (b) Online mode: $O(\Delta n + \Delta m)$
     New applications:
     1. Anomaly detection in evolving Wikipedia hyperlink networks
     2. Bifurcation detection of cellular networks during cell reprogramming
     3. Synthesized denial of service attack detection in router networks

  7. Efficient VNGE Computation via FINGER
     Recall $H = -\sum_{i=1}^{n} \lambda_i \ln \lambda_i$ ⇒ $O(n^3)$ cubic complexity
     FINGER enables fast and incremental computation of $H$ with an asymptotic approximation guarantee
     Lemma (Quadratic approximation of $H$)
       The quadratic approximation of the von Neumann graph entropy $H$ via Taylor expansion is equivalent to
       $Q = 1 - c^2 \left( \sum_{i \in \mathcal{V}} d_i^2 + 2 \sum_{(i,j) \in \mathcal{E}} w_{ij}^2 \right)$
     - $d_i$: degree (sum of edge weights) of node $i$
     - $w_{ij}$: edge weight of edge $(i, j)$
     - $c = \frac{1}{2 \sum_{(i,j) \in \mathcal{E}} w_{ij}}$
     $O(n + m)$ linear complexity, where $|\mathcal{V}| = n$ and $|\mathcal{E}| = m$.
     $Q$ can be incrementally updated given graph changes $\Delta G$ ⇒ $O(\Delta n + \Delta m)$ complexity
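The lemma can be sketched in a few lines of standard-library Python. This is a minimal illustration, assuming the graph is given as a dict mapping canonical edges `(i, j)` with `i < j` to positive weights and that there are no self-loops. The names `quadratic_q` and `QTracker` are hypothetical, and the running-sum bookkeeping in `QTracker` is one possible way to obtain a per-edge-change update of $Q$; it is not claimed to be the update rule used in the paper.

```python
from collections import defaultdict

def quadratic_q(edges):
    """Batch Q = 1 - c^2 * (sum_i d_i^2 + 2 * sum_(i,j) w_ij^2) in O(n + m)."""
    deg = defaultdict(float)
    total_w = 0.0   # sum of edge weights
    sum_w2 = 0.0    # sum of squared edge weights
    for (i, j), w in edges.items():
        deg[i] += w
        deg[j] += w
        total_w += w
        sum_w2 += w * w
    c = 1.0 / (2.0 * total_w)
    sum_d2 = sum(d * d for d in deg.values())
    return 1.0 - c * c * (sum_d2 + 2.0 * sum_w2)

class QTracker:
    """Keeps the three sums behind Q so one edge change costs O(1) (assumed design)."""

    def __init__(self):
        self.w = {}                    # canonical edge -> current weight
        self.deg = defaultdict(float)  # node -> current degree
        self.S = 0.0                   # sum of edge weights
        self.A = 0.0                   # sum of squared degrees
        self.B = 0.0                   # sum of squared edge weights

    def add_weight(self, i, j, delta):
        """Change the weight of edge (i, j), i != j, by delta."""
        key = (min(i, j), max(i, j))
        old = self.w.get(key, 0.0)
        for v in (i, j):
            self.A += (self.deg[v] + delta) ** 2 - self.deg[v] ** 2
            self.deg[v] += delta
        self.B += (old + delta) ** 2 - old ** 2
        self.S += delta
        self.w[key] = old + delta

    def q(self):
        # c = 1 / (2 S), so c^2 = 1 / (4 S^2)
        return 1.0 - (self.A + 2.0 * self.B) / (4.0 * self.S ** 2)

# Unit-weight triangle: normalized eigenvalues are 0, 1/2, 1/2, so Q = 1/2.
triangle = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
q_batch = quadratic_q(triangle)

tracker = QTracker()
for (i, j), w in triangle.items():
    tracker.add_weight(i, j, w)
q_online = tracker.q()
```

The batch and tracked values agree because $Q = 1 - \mathrm{trace}(L_N^2)$ depends only on the three running sums.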

  8. Approximate VNGE with Asymptotic Guarantees
     Let $\lambda_{\max}$ ($\lambda_{\min}$) be the largest (smallest) positive eigenvalue in $\{\lambda_i\}$
     - Approx. VNGE for batch graph sequence: $\hat{H}(G) = -Q \ln \lambda_{\max}$
     - Approx. VNGE for online graph sequence: $\tilde{H}(G) = -Q \ln(2c \cdot d_{\max})$
     - Relation: $\tilde{H} \le \hat{H} \le H$
     Theorem ($o(\ln n)$ approximation error with balanced eigenspectrum)
       If the number of positive eigenvalues $n_+ = \Omega(n)$ and $\lambda_{\min} = \Omega(\lambda_{\max})$, the scaled approximation error (SAE) satisfies $\frac{H - \hat{H}}{\ln n} \to 0$ and $\frac{H - \tilde{H}}{\ln n} \to 0$ as $n \to \infty$.
       Here $f(n) = o(h(n))$ and $f(n) = \Omega(h(n))$ mean $\lim_{n \to \infty} \frac{f(n)}{h(n)} = 0$ and $\limsup_{n \to \infty} \left| \frac{f(n)}{h(n)} \right| > 0$, respectively.
     Computing $\lambda_{\max}$ only requires $O(n + m)$ operations via power iteration ⇒ $O(n + m)$ linear complexity for $\hat{H}$.
     Theorem (Incremental update of $\tilde{H}$ with $O(\Delta n + \Delta m)$ complexity)
       The VNGE $\tilde{H}(G \oplus \Delta G)$ can be updated by $\tilde{H}(G \oplus \Delta G) = F(\tilde{H}(G), \Delta G)$
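Both approximations can be sketched with the standard library only. This is a minimal illustration under stated assumptions: the graph is an edge dict `{(i, j): w}` with positive weights, the helper names (`graph_stats`, `lambda_max`, `h_hat`, `h_tilde`) are hypothetical, and the power iteration uses a fixed iteration count and a seeded random start rather than a convergence test. Each iteration touches every node and edge once, which is where the $O(n + m)$ cost per step comes from.

```python
import math
import random
from collections import defaultdict

def graph_stats(edges):
    """Return (deg, c, Q) for an undirected edge dict {(i, j): w}."""
    deg = defaultdict(float)
    total_w = 0.0
    sum_w2 = 0.0
    for (i, j), w in edges.items():
        deg[i] += w
        deg[j] += w
        total_w += w
        sum_w2 += w * w
    c = 1.0 / (2.0 * total_w)
    q = 1.0 - c * c * (sum(d * d for d in deg.values()) + 2.0 * sum_w2)
    return deg, c, q

def lambda_max(edges, deg, c, iters=200, seed=0):
    """Largest eigenvalue of L_N = c * L via power iteration on the edge list."""
    rng = random.Random(seed)
    x = {v: rng.random() - 0.5 for v in deg}   # non-constant start vector
    norm = math.sqrt(sum(val * val for val in x.values()))
    x = {v: val / norm for v, val in x.items()}
    lam = 0.0
    for _ in range(iters):
        y = {v: deg[v] * x[v] for v in deg}    # y = D x
        for (i, j), w in edges.items():        # y -= W x
            y[i] -= w * x[j]
            y[j] -= w * x[i]
        lam = c * sum(x[v] * y[v] for v in deg)  # Rayleigh quotient of L_N
        norm = math.sqrt(sum(val * val for val in y.values()))
        if norm == 0.0:
            break
        x = {v: val / norm for v, val in y.items()}
    return lam

def h_hat(edges):
    """Batch-mode approximation: -Q * ln(lambda_max)."""
    deg, c, q = graph_stats(edges)
    return -q * math.log(lambda_max(edges, deg, c))

def h_tilde(edges):
    """Online-mode approximation: -Q * ln(2 * c * d_max)."""
    deg, c, q = graph_stats(edges)
    return -q * math.log(2.0 * c * max(deg.values()))

# 5-node star with unit weights: Laplacian eigenvalues are 0, 1, 1, 1, 5 and
# trace(L) = 8, so lambda_max of L_N is 5/8 and 2 * c * d_max is exactly 1.
star = {(0, k): 1.0 for k in range(1, 5)}
deg, c, q = graph_stats(star)
lam = lambda_max(star, deg, c)
```

On this small star the online estimate is 0 while the batch estimate is positive, consistent with the ordering $\tilde{H} \le \hat{H}$ stated on the slide.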

  9. Numerical Validation on Synthetic Random Graphs
     [Figure: Scaled approximation error (SAE) and computation time reduction ratio (%) versus number of nodes, for Erdős-Rényi graphs and Watts-Strogatz graphs. Legend entries: d = 2, 5, 10, 20, 50, 100, 200 and p_WS = 0, 0.1, 0.2, 0.4, 0.6, 0.8, 1.]
     scaled approximation error (SAE) = $\frac{H - H_{\text{approx}}}{\ln n}$
     computation time reduction ratio = $\frac{\text{Time}_{H} - \text{Time}_{H_{\text{approx}}}}{\text{Time}_{H}}$
     - almost 100% speed-up ($O(n^3)$ vs. $O(n + m)$)
     - approximation error decreases as average degree increases
     - regular (random) graphs have smaller (larger) approximation error

  10. Jensen-Shannon Distance between Graphs using FINGER
     Two graphs $G$ and $\tilde{G}$ of the same node set $\mathcal{V}$.
     KL divergence: $D_{KL}(G \| \tilde{G}) = \mathrm{trace}\left( L_N(G) \cdot [\ln L_N(G) - \ln L_N(\tilde{G})] \right)$ (not symmetric)
     Let $\bar{G} = \frac{G \oplus \tilde{G}}{2}$ denote the averaged graph of $G$ and $\tilde{G}$, where $L_N(\bar{G}) = \frac{L_N(G) + L_N(\tilde{G})}{2}$.
     The Jensen-Shannon divergence is defined as
       $\mathrm{DIV}_{JS}(G, \tilde{G}) = \frac{1}{2} D_{KL}(G \| \bar{G}) + \frac{1}{2} D_{KL}(\tilde{G} \| \bar{G}) = H(\bar{G}) - \frac{1}{2} [H(G) + H(\tilde{G})]$ (symmetric)
     The Jensen-Shannon distance is defined as $\mathrm{JSdist}(G, \tilde{G}) = \sqrt{\mathrm{DIV}_{JS}}$, which is proved to be a valid distance metric.
     Briët, Jop, and Peter Harremoës. "Properties of classical and quantum Jensen-Shannon divergence." Physical Review A 79.5 (2009): 052311.
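The entropy form of the divergence can be evaluated exactly on small graphs. The following is a minimal sketch, assuming dense symmetric weight matrices over the same node set and NumPy; the helper names `density`, `entropy`, and `js_distance` are illustrative. It works directly with the normalized Laplacians, so the averaged graph enters only through $L_N(\bar{G}) = \frac{1}{2}[L_N(G) + L_N(\tilde{G})]$, and it clips tiny negative round-off before taking the square root.

```python
import numpy as np

def density(W):
    """Normalized Laplacian L_N = L / trace(L) of a symmetric weight matrix."""
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    return L / np.trace(L)

def entropy(rho):
    """Von Neumann entropy from the positive eigenvalues of a density matrix."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]   # assumed cutoff for numerically zero eigenvalues
    return float(-np.sum(lam * np.log(lam)))

def js_distance(W1, W2):
    """Exact JSdist via H(G_bar) - (H(G) + H(G_tilde)) / 2."""
    rho1, rho2 = density(W1), density(W2)
    rho_bar = 0.5 * (rho1 + rho2)
    div = entropy(rho_bar) - 0.5 * (entropy(rho1) + entropy(rho2))
    return float(np.sqrt(max(div, 0.0)))   # clip round-off below zero

# Path graph versus triangle on three nodes.
W_path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
W_tri = np.ones((3, 3)) - np.eye(3)
d_forward = js_distance(W_path, W_tri)
d_backward = js_distance(W_tri, W_path)
```

The two call orders give the same value, reflecting the symmetry noted on the slide.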

  11. FINGER Algorithms for Jensen-Shannon Distance
     Jensen-Shannon distance computation via FINGER-$\hat{H}$ (batch mode):
       Input: Two graphs $G$ and $\tilde{G}$
       Output: $\mathrm{JSdist}(G, \tilde{G})$
       1. Obtain $\bar{G} = \frac{G \oplus \tilde{G}}{2}$ and compute $\hat{H}(G)$, $\hat{H}(\tilde{G})$, and $\hat{H}(\bar{G})$ via FINGER (Fast)
       2. $\mathrm{JSdist}(G, \tilde{G}) = \sqrt{\hat{H}(\bar{G}) - \frac{1}{2} [\hat{H}(G) + \hat{H}(\tilde{G})]}$
       ⇒ $O(n + m)$ complexity inherited from $\hat{H}$
     Jensen-Shannon distance computation via FINGER-$\tilde{H}$ (online mode):
       Input: Graph $G$ and its changes $\Delta G$, approx. VNGE $\tilde{H}(G)$ of $G$
       Output: $\mathrm{JSdist}(G, G \oplus \Delta G)$
       1. Compute $\tilde{H}(G \oplus \frac{\Delta G}{2})$ and $\tilde{H}(G \oplus \Delta G)$ via FINGER (Inc.)
       2. $\mathrm{JSdist}(G, G \oplus \Delta G) = \sqrt{\tilde{H}(G \oplus \frac{\Delta G}{2}) - \frac{1}{2} [\tilde{H}(G) + \tilde{H}(G \oplus \Delta G)]}$
       ⇒ $O(\Delta n + \Delta m)$ complexity inherited from $\tilde{H}$
     $o(\sqrt{\ln n})$ approximation guarantee of JSdist via FINGER (see paper)
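The batch-mode steps can be sketched as follows, using only the standard library. This is a hedged illustration, not the authors' implementation: the function names are hypothetical, both graphs are assumed to be edge dicts with the same canonical key order `(i, j)`, `i < j`, and the averaged graph is built by averaging the normalized edge weights $c \cdot w_{ij}$, which is one construction that satisfies $L_N(\bar{G}) = \frac{1}{2}[L_N(G) + L_N(\tilde{G})]$. The clip at zero is an added safeguard, since the approximate divergence is not guaranteed to be nonnegative on small graphs.

```python
import math
import random
from collections import defaultdict

def h_hat(edges, iters=300, seed=0):
    """Batch estimate -Q * ln(lambda_max) from an edge dict {(i, j): w}."""
    deg = defaultdict(float)
    total_w = 0.0
    sum_w2 = 0.0
    for (i, j), w in edges.items():
        deg[i] += w
        deg[j] += w
        total_w += w
        sum_w2 += w * w
    c = 1.0 / (2.0 * total_w)
    q = 1.0 - c * c * (sum(d * d for d in deg.values()) + 2.0 * sum_w2)

    # Power iteration for the largest eigenvalue of L_N = c * (D - W).
    rng = random.Random(seed)
    x = {v: rng.random() - 0.5 for v in deg}
    norm = math.sqrt(sum(val * val for val in x.values()))
    x = {v: val / norm for v, val in x.items()}
    lam = 0.0
    for _ in range(iters):
        y = {v: deg[v] * x[v] for v in deg}
        for (i, j), w in edges.items():
            y[i] -= w * x[j]
            y[j] -= w * x[i]
        lam = c * sum(x[v] * y[v] for v in deg)
        norm = math.sqrt(sum(val * val for val in y.values()))
        if norm == 0.0:
            break
        x = {v: val / norm for v, val in y.items()}
    return -q * math.log(lam)

def averaged_graph(e1, e2):
    """Edge dict whose Laplacian equals (L_N(G) + L_N(G_tilde)) / 2."""
    c1 = 1.0 / (2.0 * sum(e1.values()))
    c2 = 1.0 / (2.0 * sum(e2.values()))
    avg = defaultdict(float)
    for key, w in e1.items():
        avg[key] += 0.5 * c1 * w
    for key, w in e2.items():
        avg[key] += 0.5 * c2 * w
    return dict(avg)

def js_distance_finger(e1, e2):
    """Approximate JSdist: sqrt(H_hat(G_bar) - (H_hat(G) + H_hat(G_tilde)) / 2)."""
    div = h_hat(averaged_graph(e1, e2)) - 0.5 * (h_hat(e1) + h_hat(e2))
    return math.sqrt(max(div, 0.0))   # clip: the estimate can dip below zero

# Two graphs on nodes {0, 1, 2, 3} with one edge each, on disjoint node pairs.
g1 = {(0, 1): 1.0}
g2 = {(2, 3): 1.0}
d12 = js_distance_finger(g1, g2)
```

For this pair the averaged graph has $Q = 1/2$ and $\lambda_{\max} = 1/2$, while each single-edge graph has $Q = 0$, so the approximate divergence is $\frac{1}{2} \ln 2$.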
