GraphBLAS: A linear algebraic approach for high-performance graph algorithms Gábor Szárnyas szarnyas@mit.bme.hu
WHAT MAKES GRAPH PROCESSING DIFFICULT?
The “curse of connectedness”: contemporary computer architectures are good at processing linear and hierarchical data structures, such as lists, stacks, or trees. In graph processing, a massive amount of random data access is required, the CPU has frequent cache misses, and implementing parallelism is difficult.
Source: B. Shao, Y. Li, H. Wang, H. Xia (Microsoft Research), Trinity Graph Engine and its Applications, IEEE Data Engineering Bulletin, 2017
Graph processing in linear algebra
ADJACENCY MATRIX
\[
\mathbf{B}_{jk} =
\begin{cases}
1 & \text{if } (w_j, w_k) \in F \\
0 & \text{if } (w_j, w_k) \notin F
\end{cases}
\]
Rows are source nodes and columns are target nodes:
\[
\mathbf{B} =
\begin{pmatrix}
0 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 0 & 0
\end{pmatrix}
\]
Most cells are zero, so $\mathbf{B}$ is a sparse matrix: only 12 of its 49 cells are nonzero.
GRAPH TRAVERSAL WITH MATRIX MULTIPLICATION
Use vector/matrix operations to express graph algorithms: $\mathbf{w}\mathbf{B}^l$ means $l$ hops in the graph.
o one-hop: $\mathbf{w}\mathbf{B}$
o two-hop: $\mathbf{w}\mathbf{B}^2$
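The hop semantics can be sketched in a few lines of plain Python. This is only an illustration: the helper name `vxm` is hypothetical, the matrix is the 7-node example graph, and its last row is an assumption read off from the weighted version of the same graph in the SSSP slides.

```python
# Adjacency matrix of the 7-node example graph (rows = source, columns = target).
# Assumption: the last row follows the weighted matrix shown in the SSSP slides.
B = [
    [0, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
]

def vxm(w, B):
    """Row vector times matrix: result[k] = sum over j of w[j] * B[j][k]."""
    n = len(B)
    return [sum(w[j] * B[j][k] for j in range(n)) for k in range(n)]

w = [0, 0, 0, 1, 0, 1, 0]   # frontier: nodes 4 and 6 (1-based numbering)
one_hop = vxm(w, B)          # corresponds to wB
two_hop = vxm(one_hop, B)    # corresponds to wB^2
print(one_hop)
print(two_hop)
```

Each entry of the result counts how many ways the corresponding node can be reached from the frontier in that number of hops.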
BOOKS ON LINEAR ALGEBRA FOR GRAPH PROCESSING
1974: Aho-Hopcroft-Ullman book
o The Design and Analysis of Computer Algorithms
1990: Cormen-Leiserson-Rivest book
o Introduction to Algorithms
2011: GALLA book (ed. Kepner and Gilbert)
o Graph Algorithms in the Language of Linear Algebra
There is a lot of literature but few practical implementations and particularly few easy-to-use libraries.
THE GRAPHBLAS STANDARD
Goal: separate the concerns of the hardware/library/application designers.
1979: BLAS
o Basic Linear Algebra Subprograms (dense)
2001: Sparse BLAS
o an extension to BLAS (insufficient for graphs, little uptake)
2013: GraphBLAS
o standard building blocks for graph algorithms in LA
Numerical stack (top to bottom): Numerical applications, LINPACK/LAPACK, separation of concerns, BLAS, Hardware architecture
Graph stack (top to bottom): Graph analytical apps, LAGraph, separation of concerns, GraphBLAS, Hardware architecture
Semiring-based graph computations
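MATRIX MULTIPLICATION
Definition:
\[
\mathbf{D} = \mathbf{B}\mathbf{C}, \qquad
\mathbf{D}(j, k) = \sum_{l} \mathbf{B}(j, l) \cdot \mathbf{C}(l, k)
\]
Example:
\[
\mathbf{D}(2, 3) = \mathbf{B}(2, 1) \cdot \mathbf{C}(1, 3) + \mathbf{B}(2, 2) \cdot \mathbf{C}(2, 3) = 2 \cdot 5 + 3 \cdot 4 = 10 + 12 = 22
\]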
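MATRIX MULTIPLICATION ON SEMIRINGS
Using the conventional semiring:
\[
\mathbf{D} = \mathbf{B}\mathbf{C}, \qquad
\mathbf{D}(j, k) = \sum_{l} \mathbf{B}(j, l) \cdot \mathbf{C}(l, k)
\]
Use arbitrary semirings that override the $\oplus$ addition and $\otimes$ multiplication operators.
Generalized formula (simplified):
\[
\mathbf{D} = \mathbf{B} \oplus.\otimes \mathbf{C}, \qquad
\mathbf{D}(j, k) = \bigoplus_{l} \mathbf{B}(j, l) \otimes \mathbf{C}(l, k)
\]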
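As a rough illustration of the generalized formula, the sketch below passes the addition, the multiplication and the additive identity as parameters. The function name `semiring_mxm` and the tiny 1×2 and 2×1 operands are assumptions made for this example; they reuse the numbers 2, 3, 5 and 4 from the worked multiplication example.

```python
import operator
from functools import reduce

def semiring_mxm(B, C, add, mul, zero):
    """D[j][k] = add-reduction over l of mul(B[j][l], C[l][k]), starting from zero."""
    rows, inner, cols = len(B), len(C), len(C[0])
    return [
        [reduce(add, (mul(B[j][l], C[l][k]) for l in range(inner)), zero)
         for k in range(cols)]
        for j in range(rows)
    ]

row = [[2, 3]]      # one row of the left operand
col = [[5], [4]]    # one column of the right operand

# Conventional semiring: 2*5 + 3*4
conventional = semiring_mxm(row, col, operator.add, operator.mul, 0)
# Min-plus semiring: min(2+5, 3+4)
min_plus = semiring_mxm(row, col, min, operator.add, float("inf"))
print(conventional, min_plus)
```

Only the three arguments `add`, `mul` and `zero` change between the two calls; the loop structure of the multiplication stays the same.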
GRAPHBLAS SEMIRINGS
The $\langle E, \oplus, \otimes, 0 \rangle$ algebraic structure is a GraphBLAS semiring if $\langle E, \oplus, 0 \rangle$ is a commutative monoid over domain $E$ with an addition operator $\oplus$ and identity $0$, where $\forall b, c, d \in E$:
o Commutative: $b \oplus c = c \oplus b$
o Associative: $(b \oplus c) \oplus d = b \oplus (c \oplus d)$
o Identity: $b \oplus 0 = b$
The multiplication operator is a closed binary operator $\otimes: E \times E \to E$.
This is less strict than the standard mathematical definition, which requires that $\otimes$ forms a monoid and distributes over $\oplus$.
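The three monoid laws can be spot-checked on a handful of sample values. The sketch below is not a proof, since it only tests the triples it is given, and the function name `looks_like_commutative_monoid` is hypothetical.

```python
from itertools import product

def looks_like_commutative_monoid(add, zero, samples):
    """Spot-check commutativity, associativity and identity on all sample triples."""
    for b, c, d in product(samples, repeat=3):
        if add(b, c) != add(c, b):
            return False          # not commutative
        if add(add(b, c), d) != add(b, add(c, d)):
            return False          # not associative
    return all(add(b, zero) == b for b in samples)

INF = float("inf")
min_ok = looks_like_commutative_monoid(min, INF, [0, 1, 2, 5, INF])
or_ok = looks_like_commutative_monoid(lambda b, c: b or c, False, [False, True])
minus_ok = looks_like_commutative_monoid(lambda b, c: b - c, 0, [0, 1, 2])
print(min_ok, or_ok, minus_ok)  # True True False
```

Minimum with identity $\infty$ and logical or with identity F pass the check, while subtraction fails because it is not commutative.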
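COMMON SEMIRINGS
| semiring | domain | $\oplus$ | $\otimes$ | $0$ |
|---|---|---|---|---|
| integer arithmetic | $b \in \mathbb{N}$ | $+$ | $\cdot$ | $0$ |
| real arithmetic | $b \in \mathbb{R}$ | $+$ | $\cdot$ | $0$ |
| lor-land | $b \in \{\mathrm{F}, \mathrm{T}\}$ | $\lor$ | $\land$ | F |
| Galois field | $b \in \{0, 1\}$ | xor | $\land$ | $0$ |
| power set | $b \subset \mathbb{Z}$ | $\cup$ | $\cap$ | $\emptyset$ |
Notation: $\mathbf{B} \oplus.\otimes \mathbf{C}$ is a matrix multiplication using addition $\oplus$ and multiplication $\otimes$, e.g. $\mathbf{B} \lor.\land \mathbf{C}$. The default is $\mathbf{B} +.\cdot \mathbf{C}$.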
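MATRIX MULTIPLICATION SEMANTICS
Semiring: integer arithmetic, domain $b \in \mathbb{N}$, $\oplus = +$, $\otimes = \cdot$, $0 = 0$
Semantics: number of paths
Example: for $\mathbf{w} = (0\ 0\ 0\ 1\ 0\ 1\ 0)$, $\mathbf{w} \oplus.\otimes \mathbf{B} = (1\ 0\ 2\ 0\ 0\ 0\ 0)$. Both frontier nodes contribute $1 \cdot 1 = 1$ to the same target node, and the contributions add up to $1 + 1 = 2$.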
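MATRIX MULTIPLICATION SEMANTICS
Semiring: lor-land, domain $b \in \{\mathrm{F}, \mathrm{T}\}$, $\oplus = \lor$, $\otimes = \land$, $0 = \mathrm{F}$
Semantics: reachability
Example: for $\mathbf{w} = (\mathrm{F}\ \mathrm{F}\ \mathrm{F}\ \mathrm{T}\ \mathrm{F}\ \mathrm{T}\ \mathrm{F})$, $\mathbf{w} \lor.\land \mathbf{B} = (\mathrm{T}\ \mathrm{F}\ \mathrm{T}\ \mathrm{F}\ \mathrm{F}\ \mathrm{F}\ \mathrm{F})$. Both frontier nodes contribute $\mathrm{T} \land \mathrm{T} = \mathrm{T}$ to the same target node, and the contributions combine to $\mathrm{T} \lor \mathrm{T} = \mathrm{T}$.
Identity element: F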
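MATRIX MULTIPLICATION SEMANTICS
Semiring: min-plus, domain $b \in \mathbb{R} \cup \{\infty\}$, $\oplus = \min$, $\otimes = +$, $0 = \infty$
Semantics: shortest path
Example: for $\mathbf{w} = (\infty\ \infty\ \infty\ .5\ \infty\ .6\ \infty)$, $\mathbf{w} \min.+ \mathbf{B} = (.7\ \infty\ .9\ \infty\ \infty\ \infty\ \infty)$. The two frontier nodes reach the same target node with costs $0.5 + 0.4 = 0.9$ and $0.6 + 0.5 = 1.1$, and the result keeps $\min(0.9, 1.1) = 0.9$.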
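The three semantics above differ only in the operators plugged into the same vector-matrix product. The sketch below replays them on the 7-node example graph; the names `EDGES`, `matrix` and `semiring_vxm` are hypothetical, and the edge weights are an assumption taken from the weighted matrix in the SSSP slides.

```python
import operator
from functools import reduce

INF = float("inf")
N = 7
# (source, target): weight, with 1-based node ids.
# Assumption: weights follow the weighted matrix of the SSSP slides.
EDGES = {
    (1, 2): 0.3, (1, 4): 0.8, (2, 5): 0.1, (2, 7): 0.7, (3, 6): 0.5,
    (4, 1): 0.2, (4, 3): 0.4, (5, 6): 0.1, (6, 3): 0.5,
    (7, 3): 0.1, (7, 4): 0.5, (7, 5): 0.9,
}

def matrix(edge_value, zero):
    """Dense N x N matrix with edge_value(weight) on edges and zero elsewhere."""
    return [[edge_value(EDGES[(j, k)]) if (j, k) in EDGES else zero
             for k in range(1, N + 1)] for j in range(1, N + 1)]

def semiring_vxm(w, B, add, mul, zero):
    """result[k] = add-reduction over j of mul(w[j], B[j][k]), starting from zero."""
    return [reduce(add, (mul(w[j], B[j][k]) for j in range(N)), zero)
            for k in range(N)]

# Integer arithmetic: number of paths from the frontier {4, 6}
counts = semiring_vxm([0, 0, 0, 1, 0, 1, 0], matrix(lambda x: 1, 0),
                      operator.add, operator.mul, 0)
# lor-land: reachability from the same frontier
reach = semiring_vxm([False, False, False, True, False, True, False],
                     matrix(lambda x: True, False),
                     operator.or_, operator.and_, False)
# min-plus: cheapest one-hop extension of the costs 0.5 (node 4) and 0.6 (node 6)
dist = semiring_vxm([INF, INF, INF, 0.5, INF, 0.6, INF],
                    matrix(lambda x: x, INF), min, operator.add, INF)
print(counts)
print(reach)
print([round(x, 1) for x in dist])
```

Nodes 1 and 3 are the only targets of the frontier, so every other entry stays at the identity of the respective semiring.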
Graph algorithms in GraphBLAS
Single-source shortest path
SSSP – SINGLE-SOURCE SHORTEST PATHS
Problem:
o From a given start node $t$, find the shortest paths to every other (reachable) node in the graph
Bellman-Ford algorithm:
o Relaxes all edges in each step
o Guaranteed to find the shortest paths using at most $o - 1$ steps, where $o$ is the number of nodes
Observation:
o The relaxation step can be captured using a vector-matrix (VM) multiplication
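For comparison with the algebraic formulation, here is a minimal sketch of the edge-list form of Bellman-Ford. The function name, the 0-based node ids and the three-node graph are hypothetical, and the sketch assumes there are no negative cycles.

```python
INF = float("inf")

def bellman_ford(num_nodes, edges, source):
    """edges: list of (u, v, weight) with 0-based ids; returns distances from source."""
    dist = [INF] * num_nodes
    dist[source] = 0
    for _ in range(num_nodes - 1):      # at most (number of nodes - 1) steps
        changed = False
        for u, v, weight in edges:      # relax every edge in each step
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                changed = True
        if not changed:                 # nothing improved: distances are final
            break
    return dist

# Hypothetical graph: 0 -> 1 (cost 4), 0 -> 2 (cost 1), 2 -> 1 (cost 2)
example = bellman_ford(3, [(0, 1, 4), (0, 2, 1), (2, 1, 2)], 0)
print(example)  # [0, 3, 1]
```

The inner loop over all edges is the relaxation step that the vector-matrix multiplication replaces.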
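SSSP – ALGEBRAIC BELLMAN-FORD
We use the min-plus semiring with identity $\infty$.
\[
\mathbf{B}_{jk} =
\begin{cases}
0 & \text{if } j = k \\
x(f_{jk}) & \text{if } f_{jk} \in F \\
\infty & \text{if } f_{jk} \notin F
\end{cases}
\]
\[
\mathbf{B} =
\begin{pmatrix}
0 & .3 & \infty & .8 & \infty & \infty & \infty \\
\infty & 0 & \infty & \infty & .1 & \infty & .7 \\
\infty & \infty & 0 & \infty & \infty & .5 & \infty \\
.2 & \infty & .4 & 0 & \infty & \infty & \infty \\
\infty & \infty & \infty & \infty & 0 & .1 & \infty \\
\infty & \infty & .5 & \infty & \infty & 0 & \infty \\
\infty & \infty & .1 & .5 & .9 & \infty & 0
\end{pmatrix}
\]
The distance vector starts as $\mathbf{e} = (\infty\ \infty\ \dots\ \infty)$ with $\mathbf{e}(t) = 0$, here $\mathbf{e} = (0\ \infty\ \infty\ \infty\ \infty\ \infty\ \infty)$, and each step computes $\mathbf{e} \min.+ \mathbf{B}$.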
SSSP – ALGEBRAIC BELLMAN-FORD
Semiring: min-plus, set $b \in \mathbb{R} \cup \{\infty\}$, $\oplus = \min$, $\otimes = +$, $0 = \infty$
Starting from $\mathbf{e} = (0\ \infty\ \infty\ \infty\ \infty\ \infty\ \infty)$, each step replaces $\mathbf{e}$ with $\mathbf{e} \min.+ \mathbf{B}$ for the same matrix $\mathbf{B}$:
| step | $\mathbf{e}$ | $\mathbf{e} \min.+ \mathbf{B}$ |
|---|---|---|
| 1 | $(0\ \infty\ \infty\ \infty\ \infty\ \infty\ \infty)$ | $(0\ .3\ \infty\ .8\ \infty\ \infty\ \infty)$ |
| 2 | $(0\ .3\ \infty\ .8\ \infty\ \infty\ \infty)$ | $(0\ .3\ 1.2\ .8\ .4\ \infty\ 1)$ |
| 3 | $(0\ .3\ 1.2\ .8\ .4\ \infty\ 1)$ | $(0\ .3\ 1.1\ .8\ .4\ .5\ 1)$ |
| 4 | $(0\ .3\ 1.1\ .8\ .4\ .5\ 1)$ | $(0\ .3\ 1\ .8\ .4\ .5\ 1)$ |
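The iteration can be replayed with a short sketch in plain Python. The function names are hypothetical, node ids are 0-based in the code, and the matrix is the one read off from the slides (zero diagonal, $\infty$ for missing edges).

```python
INF = float("inf")

# Weighted matrix of the example graph: 0 on the diagonal, INF where there is no edge.
B = [
    [0,   0.3, INF, 0.8, INF, INF, INF],
    [INF, 0,   INF, INF, 0.1, INF, 0.7],
    [INF, INF, 0,   INF, INF, 0.5, INF],
    [0.2, INF, 0.4, 0,   INF, INF, INF],
    [INF, INF, INF, INF, 0,   0.1, INF],
    [INF, INF, 0.5, INF, INF, 0,   INF],
    [INF, INF, 0.1, 0.5, 0.9, INF, 0],
]

def min_plus_vxm(e, B):
    """One relaxation step: result[k] = min over j of (e[j] + B[j][k])."""
    n = len(B)
    return [min(e[j] + B[j][k] for j in range(n)) for k in range(n)]

def algebraic_bellman_ford(B, source):
    """Repeat e = e min.+ B until nothing changes, for at most n - 1 steps."""
    n = len(B)
    e = [INF] * n
    e[source] = 0
    for _ in range(n - 1):
        nxt = min_plus_vxm(e, B)
        if nxt == e:        # fixed point reached
            break
        e = nxt
    return e

first_step = min_plus_vxm([0, INF, INF, INF, INF, INF, INF], B)
dist = algebraic_bellman_ford(B, 0)
print([round(x, 1) for x in dist])
```

The zero diagonal keeps every distance that is already known, so each step can only improve the vector; the loop stops early once a step changes nothing.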