Learning Algebraic Multigrid using Graph Neural Networks


  1. Learning Algebraic Multigrid using Graph Neural Networks
     Ilay Luz, Meirav Galun, Haggai Maron, Ronen Basri, Irad Yavneh

  2. Goal: Large-scale linear systems
     • Solve Ax = b
     • A is huge, so we need an O(n) solution!
     • Some applications:
       • Discretization of PDEs, e.g. ∂²u/∂x² + ∂²u/∂y² = f(x, y)
       • Sparse graph analysis
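As an illustration of the first application (my own sketch, not from the talk): discretizing the 2D Poisson equation on an N×N grid with the standard 5-point stencil yields a sparse system Ax = b whose size grows with the grid; the grid size N and the random right-hand side below are assumptions made for the example.

```python
import numpy as np
import scipy.sparse as sp

# Assumed example: 5-point finite-difference Laplacian on an N x N grid.
N = 64                                          # interior points per dimension
I = sp.identity(N, format="csr")
T = sp.diags([1.0, -2.0, 1.0], offsets=[-1, 0, 1], shape=(N, N), format="csr")
A = sp.kron(I, T) + sp.kron(T, I)               # size N^2 x N^2, ~5 nonzeros per row
b = np.random.rand(N * N)                       # stands in for samples of f(x, y)
# A is huge but very sparse, which is exactly the setting AMG targets.
```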

  3. Efficient linear solvers
     • Decades of research on efficient iterative solvers for large-scale systems
     • We focus on Algebraic Multigrid (AMG) solvers
     • Can we use machine learning to improve AMG solvers?
     • Follow-up to Greenfeld et al. (2019) on Geometric Multigrid

  4. What AMG does
     • AMG works by successively coarsening the system of equations and solving on multiple scales
     • A prolongation operator P creates the hierarchy
     • We want to learn a mapping P_θ(A) with fast convergence

  5. Learning P
     • Unsupervised loss function over a distribution D of matrices: min_θ 𝔼_{A∼D} ρ(M(A, P_θ(A)))
     • ρ(M(A, P_θ(A))) measures the convergence factor of the solver
     • P_θ(A) is a neural network mapping the system A to the prolongation operator P

  6. Graph neural network
     • Sparse matrices can be represented as graphs – we use a Graph Neural Network as the mapping P_θ(A)
     [Figure: an example 7×7 sparse matrix and the weighted graph it defines, with one node per row/column and an edge for every nonzero off-diagonal entry]
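A minimal sketch of this matrix-to-graph view (an illustration of mine, not the authors' code): each row/column index becomes a node carrying its diagonal entry, and each nonzero off-diagonal entry a_ij becomes a weighted edge.

```python
import scipy.sparse as sp

def matrix_to_graph(A):
    """Nodes carry diagonal entries; edges are nonzero off-diagonal entries."""
    A = sp.coo_matrix(A)
    node_features = A.diagonal()
    edges = [(i, j, w) for i, j, w in zip(A.row, A.col, A.data) if i != j]
    return node_features, edges
```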

  7. Benefits of our approach
     • Unsupervised training – relies on algebraic properties
     • Generalization – learns general rules for a wide class of problems
     • Efficient training – Fourier analysis reduces the computational burden

  8. Sample result (finite element PDE): lower is better, and ours is lower!

  9. Outline
     • Overview of AMG
     • Learning objective
     • Graph neural network
     • Results

  10. 1st ingredient of AMG: Relaxation
     • System of equations: a_i1 x_1 + a_i2 x_2 + ⋯ + a_in x_n = b_i
     • Rearrange: x_i = (1/a_ii) (b_i − Σ_{j≠i} a_ij x_j)
     • Start with an initial guess x^(0)
     • Iterate until convergence: x_i^(k+1) = (1/a_ii) (b_i − Σ_{j≠i} a_ij x_j^(k))
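Here is a minimal dense NumPy sketch of this relaxation scheme (Jacobi iteration); real AMG codes use sparse storage, and the helper name is mine.

```python
import numpy as np

def jacobi_relaxation(A, b, x, sweeps=1):
    """Apply `sweeps` Jacobi updates: x_i <- (b_i - sum_{j!=i} a_ij x_j) / a_ii."""
    D = np.diag(A)                     # diagonal entries a_ii
    R = A - np.diagflat(D)             # off-diagonal part of A
    for _ in range(sweeps):
        x = (b - R @ x) / D
    return x
```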

  11. Relaxation smooths the error
     • Since relaxation is a local procedure, its effect is to smooth out the error
     • How can we accelerate convergence by also handling the low-frequency errors?

  12. 2nd ingredient of AMG: Coarsening
     • Smooth the error, then coarsen (relax → coarsen)
     • The error is no longer smooth on the coarse grid; relaxation is fast again!

  13. Putting it all together
     • Error on original problem → Relaxation (smoothing) → Restriction → Error approximated on coarsened problem → Prolongation → Smaller error on original problem
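A minimal sketch of this two-grid cycle (dense NumPy, assumed helper names; a full AMG solver applies it recursively across many levels): pre-relax, restrict the residual with Pᵀ, solve the coarse error equation with the Galerkin operator Pᵀ A P, prolongate the correction, and post-relax.

```python
import numpy as np

def two_grid_cycle(A, b, x, P, relax, pre=1, post=1):
    x = relax(A, b, x, pre)                      # relaxation (smoothing)
    r = b - A @ x                                # residual on the original problem
    A_c = P.T @ A @ P                            # Galerkin coarse operator
    e_c = np.linalg.solve(A_c, P.T @ r)          # error approximated on coarse problem
    x = x + P @ e_c                              # prolongation of the correction
    return relax(A, b, x, post)                  # post-smoothing
```

With `relax=jacobi_relaxation` from the earlier sketch, repeated calls to `two_grid_cycle` shrink the error by roughly the solver's convergence factor per cycle.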

  14. Learning objective

  15. Prolongation operator
     • The focus of AMG is the prolongation operator P, which defines the scales and moves information between them
     • P needs to be sparse for efficiency, but must also approximate smooth errors well

  16. Learning P
     • Quality can be quantified by estimating how much the error is reduced in each iteration:
       e^(k+1) = M(A, P) e^(k)
       M(A, P) = S (I − P (Pᵀ A P)⁻¹ Pᵀ A) S
     • Asymptotically: ‖e^(k+1)‖ ≈ ρ(M) ‖e^(k)‖
     • Spectral radius: ρ(M) = max(|λ_1|, …, |λ_n|)
     • Our learning objective: min_θ 𝔼_{A∼D} ρ(M(A, P_θ(A)))
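In dense NumPy these definitions look as follows (a sketch of mine for small test matrices; S is the error-propagation matrix of the chosen relaxation, and the efficient-training tricks such as the Fourier analysis mentioned on the benefits slide are not shown here).

```python
import numpy as np

def error_propagation_matrix(A, P, S):
    """M(A, P) = S (I - P (P^T A P)^{-1} P^T A) S for a given smoother matrix S."""
    n = A.shape[0]
    coarse_correction = P @ np.linalg.solve(P.T @ A @ P, P.T @ A)
    return S @ (np.eye(n) - coarse_correction) @ S

def spectral_radius(M):
    """rho(M) = max(|lambda_1|, ..., |lambda_n|)."""
    return np.max(np.abs(np.linalg.eigvals(M)))
```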

  17. Graph neural network

  18. Representing P_θ
     • Map a sparse matrix A ∈ ℝ^{n×n} to a sparse matrix P ∈ ℝ^{n×n_c}
     • The mapping should be efficient
     • Matrices can be represented as graphs with edge weights

  19. Representing P_θ
     [Figure: the input matrix A, the chosen sparsity pattern, and the output prolongation P, each shown as a weighted graph over the same 7 nodes]

  20. GNN architecture
     • Message-passing architectures can handle any graph, and have O(n) runtime
     • The Graph Nets framework of Battaglia et al. (2018) generalizes many message-passing variants and handles edge features
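A minimal NumPy sketch of one edge-based message-passing step (a generic illustration, not the authors' Graph Nets implementation; the weight shapes and the tanh nonlinearity are assumptions): each edge updates its features from its endpoint nodes, then each node aggregates its incoming edge messages, so a step costs O(n + m) for a sparse graph with n nodes and m edges.

```python
import numpy as np

def message_passing_step(h, e, src, dst, W_edge, W_node):
    """h: (n, d) node features, e: (m, d_e) edge features,
    src/dst: (m,) endpoint indices of each edge,
    W_edge: (2d + d_e, d_e) and W_node: (d + d_e, d) learned weights."""
    e = np.tanh(np.concatenate([h[src], h[dst], e], axis=1) @ W_edge)   # edge update
    agg = np.zeros((h.shape[0], e.shape[1]))
    np.add.at(agg, dst, e)                           # sum incoming messages per node
    h = np.tanh(np.concatenate([h, agg], axis=1) @ W_node)              # node update
    return h, e
```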

  21. Results

  22. Spectral clustering
     • The bottleneck is an iterative eigenvector algorithm whose inner step is a linear solve
     • We evaluate the number of iterations required to reach convergence
     • The network is trained on a dataset of small 2D clusters and tested on various 2D and 3D distributions
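A generic sketch of why the linear solver is the bottleneck (this is shifted inverse power iteration, one common eigenvector scheme, not necessarily the exact algorithm used in the experiments; `solve`, `shift`, and `iters` are assumptions): every iteration requires one linear solve, which is where the AMG solver with the learned prolongation is plugged in.

```python
import numpy as np

def inverse_power_iteration(L, solve, shift=1e-3, iters=50):
    """Find an eigenvector of L near its smallest eigenvalue.
    `solve(A, b)` can be any linear solver, e.g. an AMG cycle with the learned P."""
    A = L + shift * np.eye(L.shape[0])   # shift keeps the system nonsingular
    v = np.random.rand(L.shape[0])
    for _ in range(iters):
        v = solve(A, v)                  # one linear solve per iteration
        v /= np.linalg.norm(v)
    return v
```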

  23. Conclusion
     • Algebraic Multigrid is an effective O(n) solver for a wide class of linear systems Ax = b
     • The main challenge in AMG is constructing the prolongation operator P, which controls how information is passed between grids
     • We use an O(n), edge-based GNN to learn a mapping P_θ(A) without supervision
     • The GNN generalizes to larger problems and to different distributions of sparsity patterns and matrix elements

  24. Take-home messages
     • In a well-developed field, it can make sense to apply machine learning to one component of the algorithm
     • Graph neural networks can be an effective tool for learning on sparse linear systems
