Learning Algebraic Multigrid Using Graph Neural Networks
Ilay Luz, Meirav Galun, Haggai Maron, Ronen Basri, Irad Yavneh
Goal: Large-scale linear systems
• Solve Ax = b
• A is huge, so we need an O(n) solution!
• Some applications:
• Discretization of PDEs, e.g. the Poisson equation ∂²u/∂x² + ∂²u/∂y² = f(x, y)
• Sparse graph analysis
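As a concrete illustration of how a PDE discretization produces such a system, here is a minimal sketch (not from the slides) that assembles the standard 5-point finite-difference Laplacian on a unit square; the function name `poisson_5pt`, the grid size, the constant right-hand side, and the use of scipy are my own illustrative choices.

```python
import numpy as np
import scipy.sparse as sp

def poisson_5pt(m):
    """Sparse system A x = b for the Poisson equation on an m-by-m interior
    grid (unit square, zero Dirichlet boundary, 5-point finite differences)."""
    h = 1.0 / (m + 1)
    T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))  # 1D second difference
    I = sp.identity(m)
    A = (sp.kron(I, T) + sp.kron(T, I)) / h**2                 # 2D 5-point Laplacian
    b = np.ones(m * m)                                          # example right-hand side f = 1
    return A.tocsr(), b

A, b = poisson_5pt(64)   # 4096 unknowns, roughly 5 nonzeros per row
```

Even for this modest grid the matrix is far too large to factor densely at scale, which is why iterative solvers are the topic of the next slides.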
Efficient linear solvers
• Decades of research on efficient iterative solvers for large-scale systems
• We focus on Algebraic Multigrid (AMG) solvers
• Can we use machine learning to improve AMG solvers?
• Follow-up to Greenfeld et al. (2019) on Geometric Multigrid
What AMG does
• AMG works by successively coarsening the system of equations and solving on multiple scales
• The prolongation operator P creates the hierarchy
• We want to learn a mapping P_θ(A) that yields fast convergence
Learning P
• Unsupervised loss over a distribution 𝒟 of problem instances: min_θ 𝔼_{A∼𝒟} ρ(M(A, P_θ(A)))
• ρ(M(A, P_θ(A))) measures the convergence factor of the solver
• P_θ(A) is a neural network mapping the system A to the prolongation operator P
Graph neural network
• Sparse matrices can be represented as graphs, so we use a Graph Neural Network as the mapping P_θ(A)
[Figure: an example 7×7 sparse matrix and its representation as a weighted graph on nodes 1–7]
Benefits of our approach
• Unsupervised training – relies on algebraic properties
• Generalization – learns general rules for a wide class of problems
• Efficient training – Fourier analysis reduces the computational burden
[Figure: sample result on a Finite Element PDE problem; lower is better, and ours is lower]
Outline
• Overview of AMG
• Learning objective
• Graph neural network
• Results
1st ingredient of AMG: Relaxation
• System of equations: a_i1 x_1 + a_i2 x_2 + ⋯ + a_in x_n = b_i
• Rearrange: x_i = (1/a_ii) (b_i − Σ_{j≠i} a_ij x_j)
• Start with an initial guess x^(0)
• Iterate until convergence: x_i^(k+1) = (1/a_ii) (b_i − Σ_{j≠i} a_ij x_j^(k))
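A minimal sketch of this relaxation as weighted Jacobi; with omega = 1 it reduces exactly to the update on the slide, and the damping factor, sweep count, and stopping test are illustrative choices rather than anything specified in the slides.

```python
import numpy as np

def jacobi_relax(A, b, x0, omega=1.0, num_sweeps=100, tol=1e-10):
    """Jacobi relaxation: x_i <- x_i + omega * (b_i - sum_j a_ij x_j) / a_ii."""
    d = A.diagonal()                  # the a_ii terms
    x = x0.copy()
    for _ in range(num_sweeps):
        r = b - A @ x                 # residual of the current guess
        x = x + omega * r / d         # update every unknown simultaneously
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
    return x
```

Each sweep only touches the nonzeros of A, so its cost is O(n) for a sparse system; the problem, as the next slide explains, is how many sweeps are needed.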
Relaxation smooths the error
• Since relaxation is a local procedure, its effect is to smooth out the error
• How can we accelerate convergence by dealing with the remaining low-frequency errors?
2nd ingredient of AMG: Coarsening
• Smooth the error, then coarsen the problem
• The error is no longer smooth on the coarse grid, so relaxation is fast again!
Putting it all together
[Diagram: the two-grid cycle. Relaxation (smoothing) is applied to the error on the original problem; the error is restricted to the coarsened problem, approximated there, and prolongated back, leaving a smaller error on the original problem.]
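A minimal sketch of the two-grid cycle just described, assuming a prolongation P is already given (e.g. produced by the learned network). The Galerkin coarse operator PᵀAP and restriction R = Pᵀ are standard AMG choices; the Jacobi smoother, damping, sweep counts, and direct coarse solve (which full AMG replaces with recursion) are my own simplifications.

```python
import numpy as np
import scipy.sparse.linalg as spla

def two_grid_cycle(A, b, x, P, omega=0.8, pre=2, post=2):
    """One two-grid cycle: relax, restrict the residual, solve the coarse
    problem, prolongate the correction, relax again."""
    d = A.diagonal()

    def relax(x, sweeps):
        for _ in range(sweeps):
            x = x + omega * (b - A @ x) / d      # weighted Jacobi smoothing
        return x

    x = relax(x, pre)                            # pre-smoothing
    r_c = P.T @ (b - A @ x)                      # restrict the residual (R = P^T)
    A_c = (P.T @ A @ P).tocsc()                  # Galerkin coarse operator
    e_c = spla.spsolve(A_c, r_c)                 # coarse-grid solve (recursive in full AMG)
    x = x + P @ e_c                              # prolongate the coarse correction
    x = relax(x, post)                           # post-smoothing
    return x
```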
Learning objective
Prolongation operator
• The focus of AMG is the prolongation operator P, which defines the scales and moves information between them
• P needs to be sparse for efficiency, but must also approximate smooth errors well
Learning P
• Quality can be quantified by estimating how much the error is reduced in each iteration:
• e^(k+1) = M(A, P) e^(k)
• M(A, P) = S (I − P (PᵀAP)⁻¹ PᵀA) S, where S is the error-propagation matrix of the relaxation and I is the identity
• Asymptotically: ‖e^(k+1)‖ ≈ ρ(M) ‖e^(k)‖
• Spectral radius: ρ(M) = max(|λ_1|, …, |λ_n|)
• Our learning objective: min_θ 𝔼_{A∼𝒟} ρ(M(A, P_θ(A)))
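A minimal dense sketch of this quantity for a single instance (A, P), assuming the relaxation S is damped Jacobi, S = I − ωD⁻¹A; this is only the literal definition of the convergence factor, whereas the actual training objective in the work uses Fourier analysis and differentiable surrogates to keep the loss cheap to evaluate.

```python
import numpy as np

def convergence_factor(A, P, omega=0.8):
    """Spectral radius of the two-grid error-propagation matrix
    M = S (I - P (P^T A P)^{-1} P^T A) S, with a damped-Jacobi smoother S."""
    n = A.shape[0]
    I = np.eye(n)
    S = I - omega * np.diag(1.0 / np.diag(A)) @ A           # smoother error propagation
    C = I - P @ np.linalg.solve(P.T @ A @ P, P.T @ A)        # coarse-grid correction
    M = S @ C @ S
    return np.max(np.abs(np.linalg.eigvals(M)))              # rho(M): asymptotic factor
```

The smaller this value, the faster the error shrinks per cycle, which is exactly what the learned prolongation is trained to minimize in expectation over the problem distribution.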
Graph neural network
Representing P_θ
• Maps a sparse matrix A ∈ ℝ^(n×n) to a sparse matrix P ∈ ℝ^(n×n_c)
• The mapping should be efficient
• Matrices can be represented as graphs with edge weights
Representing P_θ
[Figure: the input matrix A, the chosen sparsity pattern, and the output prolongation P, each shown as a weighted graph on nodes 1–7]
GNN architecture
• Message Passing architectures can handle any graph and have O(n) runtime
• The Graph Nets framework of Battaglia et al. (2018) generalizes many MP variants and handles edge features
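A minimal sketch of the graph encoding and one message-passing round over edge features. The weight matrices `W_e` and `W_n` and the simple tanh-linear updates are placeholders for illustration; the actual network follows the Graph Nets framework with learned MLPs, several rounds, and a decoding step onto the allowed sparsity pattern of P, none of which is reproduced here.

```python
import numpy as np
import scipy.sparse as sp

def matrix_to_graph(A):
    """Encode a sparse matrix as a graph: one node per row/column,
    one directed edge per nonzero, with the entry value as the edge feature."""
    A = sp.coo_matrix(A)
    senders, receivers = A.row, A.col
    edge_feats = A.data.reshape(-1, 1)
    node_feats = np.zeros((A.shape[0], 1))     # no node features in this sketch
    return node_feats, edge_feats, senders, receivers

def message_passing_step(node_feats, edge_feats, senders, receivers, W_e, W_n):
    """One round: update each edge from its endpoints, then each node from
    the sum of its incoming edge messages (linear updates for illustration)."""
    edge_in = np.concatenate(
        [edge_feats, node_feats[senders], node_feats[receivers]], axis=1)
    new_edges = np.tanh(edge_in @ W_e)
    agg = np.zeros((node_feats.shape[0], new_edges.shape[1]))
    np.add.at(agg, receivers, new_edges)       # aggregate messages per receiving node
    node_in = np.concatenate([node_feats, agg], axis=1)
    new_nodes = np.tanh(node_in @ W_n)
    return new_nodes, new_edges
```

Because every step only loops over nodes and nonzero edges, the whole forward pass inherits the O(n) cost noted on the slide.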
Results
Spectral clustering
• The bottleneck is an iterative eigenvector algorithm that uses a linear solver
• We evaluate the number of iterations required to reach convergence
• The network is trained on a dataset of small 2D clusters and tested on various 2D and 3D distributions
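To show where a linear solver enters such an eigenvector computation, here is a sketch of inverse power iteration for the second-smallest eigenvector of a graph Laplacian; the function name `fiedler_vector`, the shift value, the deflation of the constant vector, and the direct factorization standing in for the AMG solve are my own choices, and the actual eigensolver and stopping rule used in the experiments may differ.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def fiedler_vector(L, shift=1e-2, tol=1e-8, max_iters=200):
    """Inverse power iteration on a graph Laplacian L. Each iteration needs
    one linear solve -- the step an AMG solver would accelerate."""
    n = L.shape[0]
    ones = np.ones(n) / np.sqrt(n)              # known null-space direction of L
    A = (L + shift * sp.identity(n)).tocsc()    # small shift keeps the system SPD
    solve = spla.factorized(A)                  # stand-in for an AMG solve
    v = np.random.default_rng(0).standard_normal(n)
    v -= ones * (ones @ v)                      # deflate the constant vector
    v /= np.linalg.norm(v)
    for _ in range(max_iters):
        w = solve(v)                            # linear solve: the bottleneck
        w -= ones * (ones @ w)
        w /= np.linalg.norm(w)
        if min(np.linalg.norm(w - v), np.linalg.norm(w + v)) < tol:
            v = w
            break
        v = w
    return v
```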
Conclusion
• Algebraic Multigrid is an effective O(n) solver for a wide class of linear systems Ax = b
• The main challenge in AMG is constructing the prolongation operator P, which controls how information is passed between grids
• We use an O(n), edge-based GNN to learn a mapping P_θ(A), without supervision
• The GNN generalizes to larger problems with different distributions of sparsity patterns and matrix elements
Take-home messages
• In a well-developed field, it may make sense to apply ML to one component of the algorithm
• Graph neural networks can be an effective tool for learning on sparse linear systems