Learning Algebraic Multigrid Using Graph Neural Networks
Ilay Luz, Meirav Galun, Haggai Maron, Ronen Basri, Irad Yavneh
Goal: large-scale linear systems
• Solve $Ax = b$
• $A$ is huge, so we need an $O(n)$ solution!
• Some applications:
  – Discretization of PDEs, e.g. $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x, y)$
  – Sparse graph analysis
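As a concrete illustration (our sketch, not part of the talk), discretizing the 2D Poisson equation above with finite differences yields exactly such a sparse system; the SciPy construction below is a standard one, and the grid size is an arbitrary choice.

```python
# A minimal sketch: the 5-point finite-difference discretization of the
# 2D Poisson equation on an n x n grid yields a sparse system Ax = b.
import numpy as np
import scipy.sparse as sp

def poisson_2d(n):
    """5-point Laplacian on an n x n interior grid (size n^2 x n^2)."""
    I = sp.identity(n, format="csr")
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
    return sp.kron(I, T) + sp.kron(T, I)

A = poisson_2d(64)                 # 4096 x 4096, at most 5 nonzeros per row
b = np.random.rand(A.shape[0])     # stand-in for f(x, y) sampled on the grid
# Direct factorization scales poorly; multigrid targets O(n) in the unknowns.
```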
Efficient linear solvers
• Decades of research on efficient iterative solvers for large-scale systems
• We focus on Algebraic Multigrid (AMG) solvers
• Can we use machine learning to improve AMG solvers?
• Follow-up to Greenfeld et al. (2019) on Geometric Multigrid
What AMG does
• AMG works by successively coarsening the system of equations and solving on multiple scales
• A prolongation operator $P$ creates the hierarchy
• We want to learn a mapping $P_\theta(A)$ with fast convergence
Learning $P$
• Unsupervised loss function over a distribution $\mathcal{D}$: $\min_\theta \; \mathbb{E}_{A \sim \mathcal{D}}\, \mathcal{L}(A, P_\theta(A))$
• $\mathcal{L}(A, P_\theta(A))$ measures the convergence factor of the solver
• $P_\theta(A)$ is a neural network mapping the system $A$ to a prolongation operator $P$
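A hypothetical training-loop skeleton for this objective (our sketch: `network`, `sample_A`, and `convergence_loss` are placeholder names, not the authors' API, and PyTorch is used only for illustration):

```python
# Hypothetical skeleton of unsupervised training: no labels, only the
# algebraic convergence measure L(A, P_theta(A)) drives the gradients.
import torch

def train(network, sample_A, convergence_loss, steps=1000, lr=1e-3):
    opt = torch.optim.Adam(network.parameters(), lr=lr)
    for _ in range(steps):
        A = sample_A()                  # draw a problem instance A ~ D
        P = network(A)                  # GNN maps the system A to a prolongation P
        loss = convergence_loss(A, P)   # surrogate of the convergence factor
        opt.zero_grad()
        loss.backward()
        opt.step()
    return network
```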
Graph neural network
• Sparse matrices can be represented as graphs; we use a graph neural network as the mapping $P_\theta(A)$
• Example: $A = \begin{pmatrix} 2.7 & -0.5 & -0.5 & 0 & -1.7 & 0 & 0 \\ -0.5 & 7.7 & -4.9 & -0.6 & 0 & 0 & -1.7 \\ -0.5 & -4.9 & 6.2 & 0 & 0 & -0.8 & 0 \\ 0 & -0.6 & 0 & 2.9 & -0.6 & 0 & -1.7 \\ -1.7 & 0 & 0 & -0.6 & 13.1 & -10.8 & 0 \\ 0 & 0 & -0.8 & 0 & -10.8 & 11.6 & 0 \\ 0 & -1.7 & 0 & -1.7 & 0 & 0 & 3.4 \end{pmatrix}$ [Figure: the corresponding weighted graph on nodes 1–7]
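A minimal sketch (ours, not the authors' code) of this correspondence: every row/column index becomes a node, and every nonzero entry $A_{ij}$ becomes a weighted edge $(i, j)$; diagonal entries become self-loops. This weighted graph is exactly the input the GNN consumes.

```python
# A minimal sketch: viewing a sparse matrix as a weighted graph.
import scipy.sparse as sp

def matrix_to_graph(A):
    """Nodes are row/column indices; each nonzero A[i, j] is a weighted edge."""
    A = sp.coo_matrix(A)
    nodes = list(range(A.shape[0]))
    edges = [(i, j, w) for i, j, w in zip(A.row, A.col, A.data)]
    return nodes, edges

# Top-left 3x3 block of the example matrix above:
A = sp.csr_matrix([[ 2.7, -0.5, -0.5],
                   [-0.5,  7.7, -4.9],
                   [-0.5, -4.9,  6.2]])
nodes, edges = matrix_to_graph(A)   # 3 nodes, 9 edges (diagonals are self-loops)
```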
Benefits of our approach
• Unsupervised training: rely on algebraic properties
• Generalization: learn general rules for a wide class of problems
• Efficient training: Fourier analysis reduces the computational burden
Sample result (Finite Element PDE): lower is better, and ours is lower!
[Figure: convergence comparison on a finite element PDE problem]
Outline
• Overview of AMG
• Learning objective
• Graph neural network
• Results
1st ingredient of AMG: Relaxation
• System of equations: $a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n = b_i$
• Rearrange: $x_i = \frac{1}{a_{ii}} \bigl( b_i - \sum_{j \neq i} a_{ij} x_j \bigr)$
• Start with an initial guess $x^{(0)}$
• Iterate until convergence: $x_i^{(k+1)} = \frac{1}{a_{ii}} \bigl( b_i - \sum_{j \neq i} a_{ij} x_j^{(k)} \bigr)$
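In code, this iteration is the classic Jacobi sweep. Below is a minimal NumPy sketch; the damping factor `omega` is our addition (`omega = 1` recovers the formula above, and damped Jacobi is a common multigrid smoother):

```python
# A minimal sketch of (damped) Jacobi relaxation, matching the iteration above.
import numpy as np

def jacobi(A, b, x, sweeps=10, omega=0.8):
    D = np.diag(A)                   # the diagonal entries a_ii
    for _ in range(sweeps):
        r = b - A @ x                # residual of the current guess
        x = x + omega * r / D        # x^(k+1) = x^(k) + omega * D^-1 r
    return x

A = np.array([[4.0, -1.0], [-1.0, 4.0]])
b = np.array([1.0, 2.0])
x = jacobi(A, b, np.zeros(2))        # converges for this small SPD system
```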
Relaxation smooths the error
• Since relaxation is a local procedure, its effect is to smooth out the error
• How can we accelerate relaxation by dealing with low-frequency errors?
2nd ingredient of AMG: Coarsening
• Smooth the error, then coarsen [Figure: relax, then coarsen]
• The error is no longer smooth on the coarse grid, so relaxation is fast again!
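The coarse system itself is typically formed by the Galerkin product $A_c = P^T A P$, the same term that appears in the error-propagation formula later. A minimal sketch with an illustrative piecewise-constant $P$ (a toy choice, not the learned one):

```python
# A minimal sketch of Galerkin coarsening: A_c = P^T A P (standard AMG choice).
import numpy as np
import scipy.sparse as sp

def coarsen(A, P):
    return P.T @ A @ P               # smaller system on the coarse grid

# Toy example: aggregate pairs of unknowns (piecewise-constant prolongation).
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(4, 4), format="csr")
P = sp.csr_matrix(np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float))
A_c = coarsen(A, P)                  # 2 x 2 coarse operator
```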
Putting it all together
[Diagram: error on original problem → relaxation (smoothing) → restriction → error approximated on coarsened problem → prolongation → smaller error on original problem]
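Translating the diagram into a minimal two-grid sketch (our code, not the authors'): `relax` is any smoother, e.g. the Jacobi sweep above, and the direct coarse solve stands in for recursing on the coarse problem.

```python
# A minimal two-grid cycle mirroring the diagram:
# relax -> restrict residual -> coarse solve -> prolongate -> relax.
import numpy as np

def two_grid(A, b, x, P, relax, sweeps=2):
    x = relax(A, b, x, sweeps)            # pre-smoothing
    r_c = P.T @ (b - A @ x)               # restrict the residual (R = P^T)
    A_c = P.T @ A @ P                     # Galerkin coarse-grid operator
    e_c = np.linalg.solve(A_c, r_c)       # solve (or recurse) on the coarse grid
    x = x + P @ e_c                       # prolongate the correction
    x = relax(A, b, x, sweeps)            # post-smoothing
    return x
```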
Learning objective
Prolongation operator
• The focus of AMG is the prolongation operator $P$, which defines the scales and moves information between them
• $P$ needs to be sparse for efficiency, but must also approximate smooth errors well
Learning $P$
• Quality can be quantified by estimating how much the error is reduced in each iteration:
• $e^{(k+1)} = M(A, P)\, e^{(k)}$, where $M(A, P) = S \bigl( I - P (P^T A P)^{-1} P^T A \bigr) S$
• Asymptotically: $\|e^{(k+1)}\| \approx \rho(M)\, \|e^{(k)}\|$
• Spectral radius: $\rho(M) = \max \{ |\lambda_1|, \ldots, |\lambda_n| \}$
• Our learning objective: $\min_\theta \; \mathbb{E}_{A \sim \mathcal{D}}\, \mathcal{L}(A, P_\theta(A))$
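For small dense problems, $M(A, P)$ and $\rho(M)$ can be evaluated directly, as in the sketch below. We assume damped Jacobi for the relaxation part $S$ (the smoother is a design choice the slide does not fix), and note that training would use a smooth surrogate of $\rho$ rather than an exact eigenvalue computation.

```python
# A minimal sketch evaluating the two-grid error-propagation matrix
# M(A, P) = S (I - P (P^T A P)^{-1} P^T A) S and its spectral radius.
# S is the damped-Jacobi error-propagation matrix (an assumption).
import numpy as np

def spectral_radius_of_M(A, P, omega=0.8):
    n = A.shape[0]
    I = np.eye(n)
    S = I - omega * np.diag(1.0 / np.diag(A)) @ A        # relaxation part
    C = I - P @ np.linalg.solve(P.T @ A @ P, P.T @ A)    # coarse-grid correction
    M = S @ C @ S
    return max(abs(np.linalg.eigvals(M)))                # rho(M)

A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
P = np.array([[1.0], [0.5], [0.0]])   # toy prolongation to a 1-node coarse grid
rho = spectral_radius_of_M(A, P)      # rho < 1 means the cycle converges
```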
Graph neural network
Representing $P_\theta$
• Maps a sparse matrix $A \in \mathbb{R}^{n \times n}$ to a sparse matrix $P \in \mathbb{R}^{n \times n_c}$
• The mapping should be efficient
• Matrices can be represented as graphs with edge weights
Representing $P_\theta$
[Figure: the input matrix $A$ (the $7 \times 7$ example above) shown as a weighted graph, the prescribed sparsity pattern of $P$, and the output prolongation $P$ (one row per fine node, one column per coarse node) whose nonzero values the GNN predicts]
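A hypothetical sketch of how the prescribed pattern constrains the output: the GNN predicts one value per allowed $(i, j)$ entry, and everything off the pattern stays exactly zero. Here `predicted` stands in for the network's per-edge outputs, and the pattern itself is illustrative.

```python
# Hypothetical sketch: assemble P from per-edge predictions restricted
# to a prescribed sparsity pattern; all other entries remain zero.
import numpy as np
import scipy.sparse as sp

def assemble_P(rows, cols, predicted, shape):
    """Build P with nonzeros only on the given (rows, cols) pattern."""
    return sp.csr_matrix((predicted, (rows, cols)), shape=shape)

# Pattern: fine node i may interpolate from the coarse nodes in cols.
rows = np.array([0, 1, 1, 2])
cols = np.array([0, 0, 1, 1])
predicted = np.array([1.0, 0.3, 0.7, 1.0])    # placeholder GNN outputs
P = assemble_P(rows, cols, predicted, shape=(3, 2))
```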
GNN architecture
• Message-passing architectures can handle any graph and have $O(n)$ runtime
• [Figure: message passing on the example graph]
• The Graph Nets framework of Battaglia et al. (2018) generalizes many message-passing variants and handles edge features
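A minimal, generic message-passing layer (our sketch, not the Graph Nets implementation): edges update from their endpoints' features, nodes aggregate incoming messages, and one pass costs time linear in the number of edges, which is $O(n)$ for sparse matrices. The weight shapes and activations are arbitrary choices.

```python
# A minimal message-passing layer: edge update, then node update.
import numpy as np

def mp_layer(node_feat, edge_index, edge_feat, W_edge, W_node):
    src, dst = edge_index                           # arrays of length m
    # Edge update: combine each edge feature with its endpoints' features.
    msg_in = np.concatenate([edge_feat, node_feat[src], node_feat[dst]], axis=1)
    messages = np.tanh(msg_in @ W_edge)
    # Node update: sum incoming messages, then mix with the node feature.
    agg = np.zeros((node_feat.shape[0], messages.shape[1]))
    np.add.at(agg, dst, messages)
    node_in = np.concatenate([node_feat, agg], axis=1)
    return np.tanh(node_in @ W_node), messages

# Tiny usage with random weights:
rng = np.random.default_rng(0)
n, m, dn, de, dh = 4, 6, 3, 2, 5
node_feat = rng.normal(size=(n, dn))
edge_index = (rng.integers(0, n, m), rng.integers(0, n, m))
edge_feat = rng.normal(size=(m, de))
W_edge = rng.normal(size=(de + 2 * dn, dh))
W_node = rng.normal(size=(dn + dh, dh))
new_nodes, new_edges = mp_layer(node_feat, edge_index, edge_feat, W_edge, W_node)
```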
Results
Spectral clustering
• The bottleneck is an iterative eigenvector algorithm that uses a linear solver
• We evaluate the number of iterations required to reach convergence
• Train the network on a dataset of small 2D clusters; test on various 2D and 3D distributions
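As an illustration of where the solver enters (our sketch; the talk does not specify the eigensolver), inverse power iteration for the Fiedler vector of a graph Laplacian performs one linear solve per step, so replacing `np.linalg.solve` with a fast AMG solve is what makes it scale. The shift and deflation step are our assumptions.

```python
# A minimal sketch: inverse power iteration for the Fiedler vector of L.
# Every step solves a linear system; at scale this is AMG's job.
import numpy as np

def fiedler_vector(L, iters=100, shift=1e-3):
    n = L.shape[0]
    A = L + shift * np.eye(n)            # shifted so the solve is well-posed
    v = np.random.rand(n)
    for _ in range(iters):
        v = np.linalg.solve(A, v)        # the linear solve inside the loop
        v = v - v.mean()                 # deflate the trivial constant eigenvector
        v = v / np.linalg.norm(v)
    return v                             # approximates the Fiedler vector

# Toy path-graph Laplacian on 4 nodes:
Adj = np.array([[0, 1, 0, 0], [1, 0, 1, 0],
                [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
L = np.diag(Adj.sum(1)) - Adj
v = fiedler_vector(L)                    # the sign of v splits the path in two
```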
Conclusion
• Algebraic Multigrid is an effective $O(n)$ solver for a wide class of linear systems $Ax = b$
• The main challenge in AMG is constructing the prolongation operator $P$, which controls how information is passed between grids
• We use an $O(n)$, edge-based GNN to learn a mapping $P_\theta(A)$ without supervision
• The GNN generalizes to larger problems with different distributions of sparsity patterns and elements
Take-home messages
• In a well-developed field, it can make sense to apply ML to one component of the algorithm
• Graph neural networks can be an effective tool for learning over sparse linear systems