Solving Large Scale Eigenvalue Problems
Lecture 12, May 16, 2018: Rayleigh quotient minimization

Peter Arbenz, Computer Science Department, ETH Zürich
E-mail: arbenz@inf.ethz.ch
http://people.inf.ethz.ch/arbenz/ewp/
Survey of today's lecture

- Rayleigh quotient minimization
- Method of steepest descent
- Conjugate gradient algorithm
- Preconditioned conjugate gradient algorithm
- Locally optimal PCG (LOPCG)
- Locally optimal block PCG (LOBPCG)
Rayleigh quotient

We consider the symmetric/Hermitian generalized eigenvalue problem

$$ A x = \lambda M x, \qquad A = A^*, \quad M = M^* > 0. $$

The Rayleigh quotient is defined as

$$ \rho(x) = \frac{x^* A x}{x^* M x}. $$

We want to exploit that

$$ \lambda_1 = \min_{x \neq 0} \rho(x) \tag{1} $$

and

$$ \lambda_k = \min_{S_k \subset \mathbb{R}^n} \ \max_{x \in S_k,\, x \neq 0} \rho(x), $$

where $S_k$ is a subspace of dimension $k$.
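As a small aside (not from the slides), the Rayleigh quotient is cheap to evaluate; a minimal NumPy sketch, assuming a Hermitian $A$ and a Hermitian positive definite $M$ given as dense arrays:

```python
import numpy as np

def rayleigh_quotient(A, M, x):
    """Rayleigh quotient rho(x) = (x* A x) / (x* M x)."""
    x = np.asarray(x)
    # for Hermitian A and M both quadratic forms are real; np.real
    # just discards rounding noise in the imaginary part
    return np.real(x.conj() @ (A @ x)) / np.real(x.conj() @ (M @ x))
```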
Rayleigh quotient minimization

We want to construct a sequence $\{x_k\}_{k=0,1,\dots}$ such that $\rho(x_{k+1}) < \rho(x_k)$ for all $k$. The hope is that the sequence $\{\rho(x_k)\}$ converges to $\lambda_1$ and, as a consequence, that the vector sequence $\{x_k\}$ converges to the corresponding eigenvector.

Procedure: for any given $x_k$ we choose a search direction $p_k$ and set

$$ x_{k+1} = x_k + \delta_k p_k. $$

The parameter $\delta_k$ is determined such that the Rayleigh quotient of $x_{k+1}$ is minimal:

$$ \rho(x_{k+1}) = \min_{\delta} \rho(x_k + \delta p_k). $$
Rayleigh quotient minimization (cont.)

$$ \rho(x_k + \delta p_k)
 = \frac{x_k^* A x_k + 2\delta\, x_k^* A p_k + \delta^2\, p_k^* A p_k}
        {x_k^* M x_k + 2\delta\, x_k^* M p_k + \delta^2\, p_k^* M p_k}
 = \frac{\begin{pmatrix} 1 \\ \delta \end{pmatrix}^*
         \begin{pmatrix} x_k^* A x_k & x_k^* A p_k \\ p_k^* A x_k & p_k^* A p_k \end{pmatrix}
         \begin{pmatrix} 1 \\ \delta \end{pmatrix}}
        {\begin{pmatrix} 1 \\ \delta \end{pmatrix}^*
         \begin{pmatrix} x_k^* M x_k & x_k^* M p_k \\ p_k^* M x_k & p_k^* M p_k \end{pmatrix}
         \begin{pmatrix} 1 \\ \delta \end{pmatrix}}. $$

This is the Rayleigh quotient associated with the $2 \times 2$ generalized eigenvalue problem

$$ \begin{pmatrix} x_k^* A x_k & x_k^* A p_k \\ p_k^* A x_k & p_k^* A p_k \end{pmatrix}
   \begin{pmatrix} \alpha \\ \beta \end{pmatrix}
 = \lambda
   \begin{pmatrix} x_k^* M x_k & x_k^* M p_k \\ p_k^* M x_k & p_k^* M p_k \end{pmatrix}
   \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \tag{2} $$

The smaller of the two eigenvalues of (2) is the sought value $\rho_{k+1} := \rho(x_{k+1})$ that minimizes the Rayleigh quotient.
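The line search therefore amounts to a $2 \times 2$ generalized eigenvalue problem. A minimal sketch for the real symmetric case (the function name and structure are my own, not from the slides), solving (2) with scipy.linalg.eigh:

```python
import numpy as np
from scipy.linalg import eigh

def line_search_delta(A, M, x, p):
    """Minimize rho(x + delta*p) by solving the projected 2x2
    generalized eigenvalue problem (2). Returns (delta, rho_min)."""
    V = np.column_stack([x, p])     # basis [x, p]
    Ah = V.T @ A @ V                # 2x2 projection of A
    Mh = V.T @ M @ V                # 2x2 projection of M
    vals, vecs = eigh(Ah, Mh)       # eigenvalues in ascending order
    y = vecs[:, 0]                  # eigenvector of the smaller eigenvalue
    # normalize the first component to 1 (assumes y[0] != 0; cf. the
    # question raised on the next slide)
    return y[1] / y[0], vals[0]
```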
Rayleigh quotient minimization (cont.)

We normalize the corresponding eigenvector such that its first component equals one. (Is this always possible?) The second component of the eigenvector is then $\delta = \delta_k$.

The second row of (2) gives

$$ p_k^* A (x_k + \delta_k p_k) = \rho_{k+1}\, p_k^* M (x_k + \delta_k p_k), $$

or

$$ p_k^* (A - \rho_{k+1} M)(x_k + \delta_k p_k) = p_k^* r_{k+1} = 0. \tag{3} $$

The 'next' residual $r_{k+1}$ is orthogonal to the current search direction $p_k$.

How shall we choose the search directions $p_k$?
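A quick numerical check of (3), using the hypothetical line_search_delta sketched above with a random symmetric $A$ and SPD $M$:

```python
rng = np.random.default_rng(0)
n = 50
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                   # symmetric A
M = np.eye(n) + (B @ B.T) / n       # symmetric positive definite M
x = rng.standard_normal(n)
p = rng.standard_normal(n)

delta, rho_next = line_search_delta(A, M, x, p)
r_next = (A - rho_next * M) @ (x + delta * p)
print(p @ r_next)                   # ~ 0 up to rounding, as (3) predicts
```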
Detour: steepest descent method for linear systems

We consider the linear system

$$ A x = b, \tag{4} $$

where $A$ is SPD (or HPD). We define the functional

$$ \varphi(x) \equiv \tfrac{1}{2} x^* A x - x^* b + \tfrac{1}{2} b^* A^{-1} b
             = \tfrac{1}{2} (A x - b)^* A^{-1} (A x - b). $$

$\varphi$ is minimized (in fact, zero) at the solution $x_*$ of (4). The negative gradient of $\varphi$ is

$$ -\nabla \varphi(x) = b - A x =: r(x). \tag{5} $$

This is the direction in which $\varphi$ decreases the most. Clearly, $\nabla \varphi(x) \neq 0 \iff x \neq x_*$.
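For reference, a minimal steepest descent iteration for (4) in the real SPD case (a textbook sketch, not code from the lecture):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-8, maxit=1000):
    """Steepest descent for A x = b, A symmetric positive definite.
    Each step does an exact line search along r = b - A x, cf. (5)."""
    x = x0.copy()
    for _ in range(maxit):
        r = b - A @ x                    # negative gradient of phi
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ (A @ r))  # minimizes phi(x + alpha * r)
        x = x + alpha * r
    return x
```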
Steepest descent method for the eigenvalue problem

We choose $p_k$ to be the negative gradient of the Rayleigh quotient,

$$ p_k = -g_k = -\nabla \rho(x_k) = -\frac{2}{x_k^* M x_k} \left( A x_k - \rho(x_k) M x_k \right). $$

Since we only care about directions, we can equivalently set

$$ p_k = r_k = A x_k - \rho_k M x_k, \qquad \rho_k = \rho(x_k). $$

With this choice of search direction we have, from (3),

$$ r_k^* r_{k+1} = 0. \tag{6} $$

The method of steepest descent often converges slowly, as it does for linear systems. This happens if the spectrum is very spread out, i.e., if the condition number of $A$ relative to $M$ is large.
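Combining the residual direction with the $2 \times 2$ line search gives a steepest descent iteration for the eigenvalue problem; a sketch for the real symmetric case, reusing the hypothetical line_search_delta from above:

```python
def steepest_descent_evp(A, M, x0, tol=1e-8, maxit=500):
    """Rayleigh quotient minimization with steepest descent
    directions p_k = r_k = A x_k - rho_k M x_k."""
    x = x0 / np.sqrt(x0 @ (M @ x0))      # normalize so that ||x||_M = 1
    rho = (x @ (A @ x)) / (x @ (M @ x))
    for _ in range(maxit):
        r = A @ x - rho * (M @ x)        # residual = scaled gradient
        if np.linalg.norm(r) < tol:
            break
        delta, rho = line_search_delta(A, M, x, r)
        x = x + delta * r
        x = x / np.sqrt(x @ (M @ x))     # keep the M-norm equal to 1
    return rho, x
```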
Slow convergence of the steepest descent method

[Figure: zigzag convergence path of steepest descent iterates. Picture: M. Gutknecht]
Detour: conjugate gradient algorithm for linear systems

For linear systems of equations, a remedy for the slow convergence of steepest descent is to use conjugate search directions. In the CG algorithm, the search directions are defined as

$$ p_k = -g_k + \beta_k p_{k-1}, \qquad k > 0, \tag{7} $$

where the coefficient $\beta_k$ is determined such that $p_k$ and $p_{k-1}$ are conjugate:

$$ p_k^* A p_{k-1} = -g_k^* A p_{k-1} + \beta_k\, p_{k-1}^* A p_{k-1} = 0, $$

so

$$ \beta_k = \frac{g_k^* A p_{k-1}}{p_{k-1}^* A p_{k-1}} = \cdots = \frac{g_k^* g_k}{g_{k-1}^* g_{k-1}}. \tag{8} $$

One can show that $p_k^* A p_j = g_k^* g_j = 0$ for $j < k$.
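For comparison, a standard CG iteration for $A x = b$ with the directions from (7)–(8); a textbook sketch for the real SPD case:

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-8, maxit=1000):
    """CG for A x = b, A symmetric positive definite."""
    x = x0.copy()
    r = b - A @ x                    # r = -g, the negative gradient
    p = r.copy()                     # first direction: steepest descent
    rr = r @ r
    for _ in range(maxit):
        if np.sqrt(rr) < tol:
            break
        Ap = A @ p
        alpha = rr / (p @ Ap)        # exact line search
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = r @ r
        beta = rr_new / rr           # eq. (8): g_k*g_k / g_{k-1}*g_{k-1}
        p = r + beta * p             # eq. (7): conjugate to previous p
        rr = rr_new
    return x
```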
The conjugate gradient algorithm

The conjugate gradient algorithm can be adapted to eigenvalue problems. The idea is straightforward: consecutive search directions must satisfy $p_k^* A p_{k-1} = 0$.

The crucial difference from linear systems stems from the fact that the functional to be minimized, the Rayleigh quotient, is no longer quadratic. (E.g., there is no finite termination property.)

The gradient of $\rho(x)$ is

$$ g = \nabla \rho(x) = \frac{2}{x^* M x} \left( A x - \rho(x) M x \right). $$
The conjugate gradient algorithm (cont.)

In the case of eigenvalue problems, the different expressions for $\beta_k$ in (7)–(8) are no longer equivalent. We choose

$$ p_0 = -g_0, \qquad\qquad\qquad\qquad\qquad k = 0, $$
$$ p_k = -g_k + \frac{g_k^* M g_k}{g_{k-1}^* M g_{k-1}}\, p_{k-1}, \qquad k > 0, \tag{9} $$

which is the best choice according to Feng and Owen [1]. These formulae are for the generalized eigenvalue problem $A x = \lambda M x$.
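The update (9) is one line in code; a hypothetical helper for the real symmetric case:

```python
def cg_direction(g, g_prev, p_prev, M):
    """Search direction from (9): p_k = -g_k + beta_k * p_{k-1} with
    beta_k = (g_k* M g_k) / (g_{k-1}* M g_{k-1})."""
    beta = (g @ (M @ g)) / (g_prev @ (M @ g_prev))
    return -g + beta * p_prev
```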
The Rayleigh quotient algorithm

1:  Let $x_0$ be a unit vector, $\|x_0\|_M = 1$.
2:  $v_0 := A x_0$, $u_0 := M x_0$, $\rho_0 := \dfrac{v_0^* x_0}{u_0^* x_0}$, $g_0 := 2(v_0 - \rho_0 u_0)$
3:  while $\|g_k\| > \text{tol}$ do
4:      if $k = 1$ then
5:          $p_k := -g_{k-1}$;
6:      else
7:          $p_k := -g_{k-1} + \dfrac{g_{k-1}^* M g_{k-1}}{g_{k-2}^* M g_{k-2}}\, p_{k-1}$;
8:      end if
9:      Determine the smallest Ritz value $\rho_k$ and associated Ritz vector $x_k$ of $(A, M)$ in $\mathcal{R}([x_{k-1}, p_k])$
10:     $v_k := A x_k$, $u_k := M x_k$
11:     $\rho_k := x_k^* v_k / x_k^* u_k$
12:     $g_k := 2(v_k - \rho_k u_k)$
13: end while
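A compact Python sketch of the whole algorithm for the real symmetric case (my own illustrative code, not from the lecture; the Ritz step on line 9 is realized with scipy.linalg.eigh on the projected $2 \times 2$ pencil):

```python
import numpy as np
from scipy.linalg import eigh

def rayleigh_quotient_cg(A, M, x0, tol=1e-8, maxit=500):
    """Rayleigh quotient minimization with CG directions (9) and a
    2x2 Rayleigh-Ritz line search, following the listing above."""
    x = x0 / np.sqrt(x0 @ (M @ x0))          # ||x_0||_M = 1   (line 1)
    rho = (x @ (A @ x)) / (x @ (M @ x))
    g = 2.0 * (A @ x - rho * (M @ x))        # gradient        (line 2)
    p = -g                                   # first direction (line 5)
    gMg_old = g @ (M @ g)
    for _ in range(maxit):
        if np.linalg.norm(g) <= tol:
            break
        # smallest Ritz pair of (A, M) in span{x, p}           (line 9)
        V = np.column_stack([x, p])
        vals, vecs = eigh(V.T @ A @ V, V.T @ M @ V)
        x = V @ vecs[:, 0]
        x = x / np.sqrt(x @ (M @ x))
        rho = (x @ (A @ x)) / (x @ (M @ x))  # Ritz value      (line 11)
        g = 2.0 * (A @ x - rho * (M @ x))    # new gradient    (line 12)
        gMg = g @ (M @ g)
        p = -g + (gMg / gMg_old) * p         # eq. (9)         (line 7)
        gMg_old = gMg
    return rho, x
```

Near convergence $p$ can become nearly parallel to $x$, so a robust implementation would reorthogonalize the basis $[x, p]$ before the projected solve; the sketch omits this.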
Convergence

The construction of the algorithm guarantees that $\rho(x_{k+1}) < \rho(x_k)$ unless $r_k = 0$, in which case $x_k$ is the sought eigenvector.

In general, i.e., if the initial vector $x_0$ has a nonvanishing component in the direction of the 'smallest' eigenvector $u_1$, convergence is toward the smallest eigenvalue $\lambda_1$. Let

$$ x_k = \cos\vartheta_k\, u_1 + \sin\vartheta_k\, z_k =: \cos\vartheta_k\, u_1 + w_k, \tag{10} $$

where $\|x_k\|_M = \|u_1\|_M = \|z_k\|_M = 1$ and $u_1^* M z_k = 0$. Then we have

$$ \rho(x_k) = \cos^2\vartheta_k\, \lambda_1 + 2 \cos\vartheta_k \sin\vartheta_k\, u_1^* A z_k + \sin^2\vartheta_k\, z_k^* A z_k
            = \lambda_1 (1 - \sin^2\vartheta_k) + \sin^2\vartheta_k\, \rho(z_k), $$

where the cross term vanishes because $u_1^* A z_k = \lambda_1 u_1^* M z_k = 0$.
Convergence (cont.)

Thus,

$$ \rho(x_k) - \lambda_1 = \sin^2\vartheta_k \left( \rho(z_k) - \lambda_1 \right) \leq (\lambda_n - \lambda_1) \sin^2\vartheta_k. $$

As seen earlier, in symmetric eigenvalue problems the eigenvalues are much more accurate than the eigenvectors. Let us suppose that the eigenvalue has converged, $\rho(x_k) = \rho_k \cong \lambda_1$, but the eigenvector is not yet as accurate as desired. Then

$$ r_k = (A - \rho_k M) x_k \cong (A - \lambda_1 M) x_k
       = \sum_{j=1}^{n} (\lambda_j - \lambda_1)\, M u_j u_j^* M x_k
       = \sum_{j=2}^{n} (\lambda_j - \lambda_1)\, M u_j u_j^* M x_k, $$