Solving Large Scale Eigenvalue Problems
Lecture 12, May 16, 2018: Rayleigh quotient minimization

Peter Arbenz, Computer Science Department, ETH Zürich
E-mail: arbenz@inf.ethz.ch
http://people.inf.ethz.ch/arbenz/ewp/
Survey of today's lecture

- Rayleigh quotient minimization
- Method of steepest descent
- Conjugate gradient algorithm
- Preconditioned conjugate gradient algorithm
- Locally optimal PCG (LOPCG)
- Locally optimal block PCG (LOBPCG)
Rayleigh quotient

We consider the symmetric/Hermitian generalized eigenvalue problem

$$ A x = \lambda M x, \qquad A = A^*, \quad M = M^* > 0. $$

The Rayleigh quotient is defined as

$$ \rho(x) = \frac{x^* A x}{x^* M x}. $$

We want to exploit that

$$ \lambda_1 = \min_{x \neq 0} \rho(x) \tag{1} $$

and

$$ \lambda_k = \min_{S_k \subset \mathbb{R}^n} \ \max_{x \in S_k,\, x \neq 0} \rho(x), $$

where $S_k$ is a subspace of dimension $k$.
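As a small aside (not from the slides), the Rayleigh quotient is cheap to evaluate; a minimal NumPy sketch, assuming a Hermitian $A$ and a Hermitian positive definite $M$ given as dense arrays:

```python
import numpy as np

def rayleigh_quotient(A, M, x):
    """Rayleigh quotient rho(x) = (x* A x) / (x* M x)."""
    x = np.asarray(x)
    # for Hermitian A and M both quadratic forms are real; np.real
    # just discards rounding noise in the imaginary part
    return np.real(x.conj() @ (A @ x)) / np.real(x.conj() @ (M @ x))
```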
Rayleigh quotient minimization

We want to construct a sequence $\{x_k\}_{k=0,1,\dots}$ such that $\rho(x_{k+1}) < \rho(x_k)$ for all $k$. The hope is that the sequence $\{\rho(x_k)\}$ converges to $\lambda_1$ and, as a consequence, that the vector sequence $\{x_k\}$ converges to the corresponding eigenvector.

Procedure: for any given $x_k$ we choose a search direction $p_k$ and set

$$ x_{k+1} = x_k + \delta_k p_k. $$

The parameter $\delta_k$ is determined such that the Rayleigh quotient of $x_{k+1}$ is minimal:

$$ \rho(x_{k+1}) = \min_{\delta} \rho(x_k + \delta p_k). $$
Rayleigh quotient minimization (cont.)

$$ \rho(x_k + \delta p_k)
 = \frac{x_k^* A x_k + 2\delta\, x_k^* A p_k + \delta^2\, p_k^* A p_k}
        {x_k^* M x_k + 2\delta\, x_k^* M p_k + \delta^2\, p_k^* M p_k}
 = \frac{\begin{pmatrix} 1 \\ \delta \end{pmatrix}^*
         \begin{pmatrix} x_k^* A x_k & x_k^* A p_k \\ p_k^* A x_k & p_k^* A p_k \end{pmatrix}
         \begin{pmatrix} 1 \\ \delta \end{pmatrix}}
        {\begin{pmatrix} 1 \\ \delta \end{pmatrix}^*
         \begin{pmatrix} x_k^* M x_k & x_k^* M p_k \\ p_k^* M x_k & p_k^* M p_k \end{pmatrix}
         \begin{pmatrix} 1 \\ \delta \end{pmatrix}}. $$

This is the Rayleigh quotient associated with the $2 \times 2$ generalized eigenvalue problem

$$ \begin{pmatrix} x_k^* A x_k & x_k^* A p_k \\ p_k^* A x_k & p_k^* A p_k \end{pmatrix}
   \begin{pmatrix} \alpha \\ \beta \end{pmatrix}
 = \lambda
   \begin{pmatrix} x_k^* M x_k & x_k^* M p_k \\ p_k^* M x_k & p_k^* M p_k \end{pmatrix}
   \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \tag{2} $$

The smaller of the two eigenvalues of (2) is the sought value $\rho_{k+1} := \rho(x_{k+1})$ that minimizes the Rayleigh quotient.
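The line search therefore amounts to a $2 \times 2$ generalized eigenvalue problem. A minimal sketch for the real symmetric case (the function name and structure are my own, not from the slides), solving (2) with scipy.linalg.eigh:

```python
import numpy as np
from scipy.linalg import eigh

def line_search_delta(A, M, x, p):
    """Minimize rho(x + delta*p) by solving the projected 2x2
    generalized eigenvalue problem (2). Returns (delta, rho_min)."""
    V = np.column_stack([x, p])     # basis [x, p]
    Ah = V.T @ A @ V                # 2x2 projection of A
    Mh = V.T @ M @ V                # 2x2 projection of M
    vals, vecs = eigh(Ah, Mh)       # eigenvalues in ascending order
    y = vecs[:, 0]                  # eigenvector of the smaller eigenvalue
    # normalize the first component to 1 (assumes y[0] != 0; cf. the
    # question raised on the next slide)
    return y[1] / y[0], vals[0]
```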
Rayleigh quotient minimization (cont.)

We normalize the corresponding eigenvector such that its first component equals one. (Is this always possible?) The second component of the eigenvector is then $\delta = \delta_k$.

The second row of (2) gives

$$ p_k^* A (x_k + \delta_k p_k) = \rho_{k+1}\, p_k^* M (x_k + \delta_k p_k), $$

or

$$ p_k^* (A - \rho_{k+1} M)(x_k + \delta_k p_k) = p_k^* r_{k+1} = 0. \tag{3} $$

The 'next' residual $r_{k+1}$ is orthogonal to the current search direction $p_k$.

How shall we choose the search directions $p_k$?
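A quick numerical check of (3), using the hypothetical line_search_delta sketched above with a random symmetric $A$ and SPD $M$:

```python
rng = np.random.default_rng(0)
n = 50
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                   # symmetric A
M = np.eye(n) + (B @ B.T) / n       # symmetric positive definite M
x = rng.standard_normal(n)
p = rng.standard_normal(n)

delta, rho_next = line_search_delta(A, M, x, p)
r_next = (A - rho_next * M) @ (x + delta * p)
print(p @ r_next)                   # ~ 0 up to rounding, as (3) predicts
```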
Detour: steepest descent method for linear systems

We consider the linear system

$$ A x = b, \tag{4} $$

where $A$ is SPD (or HPD). We define the functional

$$ \varphi(x) \equiv \tfrac{1}{2} x^* A x - x^* b + \tfrac{1}{2} b^* A^{-1} b
             = \tfrac{1}{2} (A x - b)^* A^{-1} (A x - b). $$

$\varphi$ is minimized (in fact, zero) at the solution $x_*$ of (4). The negative gradient of $\varphi$ is

$$ -\nabla \varphi(x) = b - A x =: r(x). \tag{5} $$

This is the direction in which $\varphi$ decreases the most. Clearly, $\nabla \varphi(x) \neq 0 \iff x \neq x_*$.
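For reference, a minimal steepest descent iteration for (4) in the real SPD case (a textbook sketch, not code from the lecture):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-8, maxit=1000):
    """Steepest descent for A x = b, A symmetric positive definite.
    Each step does an exact line search along r = b - A x, cf. (5)."""
    x = x0.copy()
    for _ in range(maxit):
        r = b - A @ x                    # negative gradient of phi
        if np.linalg.norm(r) < tol:
            break
        alpha = (r @ r) / (r @ (A @ r))  # minimizes phi(x + alpha * r)
        x = x + alpha * r
    return x
```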
Steepest descent method for the eigenvalue problem

We choose $p_k$ to be the negative gradient of the Rayleigh quotient,

$$ p_k = -g_k = -\nabla \rho(x_k) = -\frac{2}{x_k^* M x_k} \left( A x_k - \rho(x_k) M x_k \right). $$

Since we only care about directions, we can equivalently set

$$ p_k = r_k = A x_k - \rho_k M x_k, \qquad \rho_k = \rho(x_k). $$

With this choice of search direction we have, from (3),

$$ r_k^* r_{k+1} = 0. \tag{6} $$

The method of steepest descent often converges slowly, as it does for linear systems. This happens if the spectrum is very spread out, i.e., if the condition number of $A$ relative to $M$ is large.
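Combining the residual direction with the $2 \times 2$ line search gives a steepest descent iteration for the eigenvalue problem; a sketch for the real symmetric case, reusing the hypothetical line_search_delta from above:

```python
def steepest_descent_evp(A, M, x0, tol=1e-8, maxit=500):
    """Rayleigh quotient minimization with steepest descent
    directions p_k = r_k = A x_k - rho_k M x_k."""
    x = x0 / np.sqrt(x0 @ (M @ x0))      # normalize so that ||x||_M = 1
    rho = (x @ (A @ x)) / (x @ (M @ x))
    for _ in range(maxit):
        r = A @ x - rho * (M @ x)        # residual = scaled gradient
        if np.linalg.norm(r) < tol:
            break
        delta, rho = line_search_delta(A, M, x, r)
        x = x + delta * r
        x = x / np.sqrt(x @ (M @ x))     # keep the M-norm equal to 1
    return rho, x
```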
Slow convergence of the steepest descent method

[Figure: zigzag convergence path of steepest descent iterates. Picture: M. Gutknecht]
Detour: conjugate gradient algorithm for linear systems

For linear systems of equations, a remedy for the slow convergence of steepest descent is to use conjugate search directions. In the CG algorithm, the search directions are defined as

$$ p_k = -g_k + \beta_k p_{k-1}, \qquad k > 0, \tag{7} $$

where the coefficient $\beta_k$ is determined such that $p_k$ and $p_{k-1}$ are conjugate:

$$ p_k^* A p_{k-1} = -g_k^* A p_{k-1} + \beta_k\, p_{k-1}^* A p_{k-1} = 0, $$

so

$$ \beta_k = \frac{g_k^* A p_{k-1}}{p_{k-1}^* A p_{k-1}} = \cdots = \frac{g_k^* g_k}{g_{k-1}^* g_{k-1}}. \tag{8} $$

One can show that $p_k^* A p_j = g_k^* g_j = 0$ for $j < k$.
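For comparison, a standard CG iteration for $A x = b$ with the directions from (7)–(8); a textbook sketch for the real SPD case:

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-8, maxit=1000):
    """CG for A x = b, A symmetric positive definite."""
    x = x0.copy()
    r = b - A @ x                    # r = -g, the negative gradient
    p = r.copy()                     # first direction: steepest descent
    rr = r @ r
    for _ in range(maxit):
        if np.sqrt(rr) < tol:
            break
        Ap = A @ p
        alpha = rr / (p @ Ap)        # exact line search
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = r @ r
        beta = rr_new / rr           # eq. (8): g_k*g_k / g_{k-1}*g_{k-1}
        p = r + beta * p             # eq. (7): conjugate to previous p
        rr = rr_new
    return x
```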
The conjugate gradient algorithm

The conjugate gradient algorithm can be adapted to eigenvalue problems. The idea is straightforward: consecutive search directions must satisfy $p_k^* A p_{k-1} = 0$.

The crucial difference from linear systems stems from the fact that the functional to be minimized, the Rayleigh quotient, is no longer quadratic. (E.g., there is no finite termination property.)

The gradient of $\rho(x)$ is

$$ g = \nabla \rho(x) = \frac{2}{x^* M x} \left( A x - \rho(x) M x \right). $$
The conjugate gradient algorithm (cont.)

In the case of eigenvalue problems, the different expressions for $\beta_k$ in (7)–(8) are no longer equivalent. We choose

$$ p_0 = -g_0, \qquad\qquad\qquad\qquad\qquad k = 0, $$
$$ p_k = -g_k + \frac{g_k^* M g_k}{g_{k-1}^* M g_{k-1}}\, p_{k-1}, \qquad k > 0, \tag{9} $$

which is the best choice according to Feng and Owen [1]. These formulae are for the generalized eigenvalue problem $A x = \lambda M x$.
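The update (9) is one line in code; a hypothetical helper for the real symmetric case:

```python
def cg_direction(g, g_prev, p_prev, M):
    """Search direction from (9): p_k = -g_k + beta_k * p_{k-1} with
    beta_k = (g_k* M g_k) / (g_{k-1}* M g_{k-1})."""
    beta = (g @ (M @ g)) / (g_prev @ (M @ g_prev))
    return -g + beta * p_prev
```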
The Rayleigh quotient algorithm

1:  Let $x_0$ be a unit vector, $\|x_0\|_M = 1$.
2:  $v_0 := A x_0$, $u_0 := M x_0$, $\rho_0 := \dfrac{v_0^* x_0}{u_0^* x_0}$, $g_0 := 2(v_0 - \rho_0 u_0)$
3:  while $\|g_k\| > \text{tol}$ do
4:      if $k = 1$ then
5:          $p_k := -g_{k-1}$;
6:      else
7:          $p_k := -g_{k-1} + \dfrac{g_{k-1}^* M g_{k-1}}{g_{k-2}^* M g_{k-2}}\, p_{k-1}$;
8:      end if
9:      Determine the smallest Ritz value $\rho_k$ and associated Ritz vector $x_k$ of $(A, M)$ in $\mathcal{R}([x_{k-1}, p_k])$
10:     $v_k := A x_k$, $u_k := M x_k$
11:     $\rho_k := x_k^* v_k / x_k^* u_k$
12:     $g_k := 2(v_k - \rho_k u_k)$
13: end while
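A compact Python sketch of the whole algorithm for the real symmetric case (my own illustrative code, not from the lecture; the Ritz step on line 9 is realized with scipy.linalg.eigh on the projected $2 \times 2$ pencil):

```python
import numpy as np
from scipy.linalg import eigh

def rayleigh_quotient_cg(A, M, x0, tol=1e-8, maxit=500):
    """Rayleigh quotient minimization with CG directions (9) and a
    2x2 Rayleigh-Ritz line search, following the listing above."""
    x = x0 / np.sqrt(x0 @ (M @ x0))          # ||x_0||_M = 1   (line 1)
    rho = (x @ (A @ x)) / (x @ (M @ x))
    g = 2.0 * (A @ x - rho * (M @ x))        # gradient        (line 2)
    p = -g                                   # first direction (line 5)
    gMg_old = g @ (M @ g)
    for _ in range(maxit):
        if np.linalg.norm(g) <= tol:
            break
        # smallest Ritz pair of (A, M) in span{x, p}           (line 9)
        V = np.column_stack([x, p])
        vals, vecs = eigh(V.T @ A @ V, V.T @ M @ V)
        x = V @ vecs[:, 0]
        x = x / np.sqrt(x @ (M @ x))
        rho = (x @ (A @ x)) / (x @ (M @ x))  # Ritz value      (line 11)
        g = 2.0 * (A @ x - rho * (M @ x))    # new gradient    (line 12)
        gMg = g @ (M @ g)
        p = -g + (gMg / gMg_old) * p         # eq. (9)         (line 7)
        gMg_old = gMg
    return rho, x
```

Near convergence $p$ can become nearly parallel to $x$, so a robust implementation would reorthogonalize the basis $[x, p]$ before the projected solve; the sketch omits this.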
Convergence

The construction of the algorithm guarantees that $\rho(x_{k+1}) < \rho(x_k)$ unless $r_k = 0$, in which case $x_k$ is the sought eigenvector.

In general, i.e., if the initial vector $x_0$ has a nonvanishing component in the direction of the 'smallest' eigenvector $u_1$, convergence is toward the smallest eigenvalue $\lambda_1$. Let

$$ x_k = \cos\vartheta_k\, u_1 + \sin\vartheta_k\, z_k =: \cos\vartheta_k\, u_1 + w_k, \tag{10} $$

where $\|x_k\|_M = \|u_1\|_M = \|z_k\|_M = 1$ and $u_1^* M z_k = 0$. Then we have

$$ \rho(x_k) = \cos^2\vartheta_k\, \lambda_1 + 2 \cos\vartheta_k \sin\vartheta_k\, u_1^* A z_k + \sin^2\vartheta_k\, z_k^* A z_k
            = \lambda_1 (1 - \sin^2\vartheta_k) + \sin^2\vartheta_k\, \rho(z_k), $$

where the cross term vanishes because $u_1^* A z_k = \lambda_1 u_1^* M z_k = 0$.
Convergence (cont.)

Thus,

$$ \rho(x_k) - \lambda_1 = \sin^2\vartheta_k \left( \rho(z_k) - \lambda_1 \right) \leq (\lambda_n - \lambda_1) \sin^2\vartheta_k. $$

As seen earlier, in symmetric eigenvalue problems the eigenvalues are much more accurate than the eigenvectors. Let us suppose that the eigenvalue has converged, $\rho(x_k) = \rho_k \cong \lambda_1$, but the eigenvector is not yet as accurate as desired. Then

$$ r_k = (A - \rho_k M) x_k \cong (A - \lambda_1 M) x_k
       = \sum_{j=1}^{n} (\lambda_j - \lambda_1)\, M u_j u_j^* M x_k
       = \sum_{j=2}^{n} (\lambda_j - \lambda_1)\, M u_j u_j^* M x_k, $$