MATH 612 Computational methods for equation solving and function minimization – Week # 10 F .J.S. Spring 2014 – University of Delaware FJS MATH 612 1 / 43
Plan for this week Discuss any problems you couldn’t solve from previous lectures We will cover Lectures 35, 36, and 38, and be done with Linear Algebra Coding assignment #3 is due next Monday Once again, start thinking about your final coding assignment Remember that... ... I’ll keep on updating, correcting, and modifying the slides until the end of each week. FJS MATH 612 2 / 43
THE ROAD TO GMRES FJS MATH 612 3 / 43
Review (1): Krylov and Arnoldi Let b ∈ C m and A ∈ C m × m . We consider the subspaces K n := � b , Ab , A 2 b , . . . , A n − 1 b � n ≥ 1 . We care about these spaces as long as their dimensions are equal to n . After that all of them are the same. Arnoldi’s method is the modified Gram-Schmidt method applied to find an orthonormal basis of the Krylov space K n . � b , Ab , A 2 b , . . . , A j − 1 b � = � q 1 , . . . , q j � We do not compute the QR decomposition of the Krylov matrix b . A n − 1 b K n = Ab . . . Still we keep score of all the computations in the GS process (the entries of R , except for the first one � b � , which will magically reappear later). FJS MATH 612 4 / 43
Review (2): Krylov and Arnoldi 1 q 1 = � b � 2 b % first step apart for j ≥ 1 % count on the next v = Aq j % the newcomer for i = 1 : j h ij = q ∗ i v % not R anymore v = v − h ij q i end h j + 1 , j = � v � 2 % stop if zero 1 q j + 1 = h j + 1 , j v end Aq j = h 1 j q 1 + h 2 j q 2 + . . . + h jj q j + h j + 1 , j q j + 1 � �� � orth. proj. onto K j FJS MATH 612 5 / 43
Review (3): Krylov and Arnoldi Aq j = h 1 j q 1 + h 2 j q 2 + . . . + h jj q j + h j + 1 , j q j + 1 , j = 1 , . . . , n q 1 A q 2 . . . q n � �� � Q n h 11 h 12 h 13 . . . h 1 n h 21 h 22 h 23 . . . h 2 n ... h 32 h 33 h 3 n q 1 = q 2 . . . q n q n + 1 . ... ... . . � �� � h n , n − 1 h nn Q n + 1 h n + 1 , n � �� � � H n FJS MATH 612 6 / 43
Review (4): Krylov and Arnoldi AQ n = Q n + 1 � H n A is A Q n contains an orthonormal basis of K n Q n + 1 contains one more vector, making for an orthonormal basis of K n + 1 � H is ( n + 1 ) × n Hessenberg And now a computation and a definition: n Q n + 1 � Q ∗ n AQ n = Q ∗ H n = H n where H n is n × n Hessenberg, obtained by removing the last row of � H n . Its eigenvalues are called Ritz values of A . They are not eigenvalues of A , but they approximate them. FJS MATH 612 7 / 43
An optimization problem Assume that we have x 0 , a first guess for the solution of Ax = b . We can think of a modified problem Ae = r , r = b − Ax 0 , e = x − x 0 . We are going to try and find e . For the same price, forget about x 0 (the book does –on purpose–) and think that x 0 = 0. A sort of frozen step of GMRES consists of minimizing � b − Ax � 2 with x ∈ K n FJS MATH 612 8 / 43
A sequence of equivalent problems Minimize � b − Ax � 2 with x ∈ K n Minimize � AK n c − b � 2 with c ∈ C n Minimize � AQ n y − b � 2 with y ∈ C n Minimize � Q n + 1 � H n y − b � 2 with y ∈ C n Minimize � � H n y − Q ∗ n + 1 b � 2 with y ∈ C n Minimize � � H n y − � b � 2 e 1 � 2 with y ∈ C n . If we can find a QR decomposition of the rectangular ( n + 1 ) × n Hessenberg matrix � H n , we are done. Why? FJS MATH 612 9 / 43
A skeleton of GMRES x 0 = 0 Start with b and compute q 1 = ( 1 / � b � 2 ) b for j ≥ 1 Compute the vector q j + 1 in Arnoldi’s method Compute and store the j -th column of � H j Compute a QR decomposition of � H j Minimize � � H j y j − � b � 2 e 1 � 2 with y j ∈ C j Compute x j = Q j y j ≈ x Decide if x j is close enough to x j − 1 . Stop if it is. end Small variation. Start with x 0 � = 0 and substitute b by r 0 = b − Ax 0 . In this case x j ∈ x 0 + K j , so what’s in the Krylov space is the error correction x j − x 0 and not x j itself. And the Krylov space is triggered by r 0 , not b . FJS MATH 612 10 / 43
GIVENS AND THE MISSING LINK FJS MATH 612 11 / 43
Givens rotations: facts Givens’ method is similar to the Householder method but it performs rotations instead of reflections to make zeros. The resulting orthogonal matrices (for the simple steps) are not symmetric. A Givens rotation G ij ( θ ) locates the block � cos ( θ ) � sin ( θ ) − sin ( θ ) cos ( θ ) in the ( i , j ) × ( i , j ) submatrix. Other than that it is an identity. G ij ( − θ ) = G ij ( θ ) − 1 . Multiplying G ij ( θ ) A involves only the i and j rows of A . It’s quite cheap therefore. Givens rotations can be applied to triangularize a matrix: ( n − 1 ) + ( n − 2 ) + . . . + 1 = n ( n − 1 ) rotations are needed. 2 FJS MATH 612 12 / 43
Givens rotations for Hessenberg matrices If H is an m × n Hessenberg matrix (many zero rows at the end have been added to customize the rotation matrices size), then G n , n − 1 ( θ n − 1 ) . . . G 32 ( θ 2 ) G 21 ( θ 1 ) H = R is upper triangular. To minimize � Hx − b � , solve Rx = G n , n − 1 ( θ n − 1 ) . . . G 32 ( θ 2 ) G 21 ( θ 1 ) b = b n . FJS MATH 612 13 / 43
Progressive Hessenberg matrices We have worked with a Hessenberg m × n matrix � H n , and a fixed right hand side k e 1 . (Note that we have artificially added zeros to the end of the matrix and of the RHS. In GMRES they were shorter. The minimization problem is the same. Why?) We want to compute the next one, that is, how much of what we used can we re-use? Before... G n , n − 1 ( θ n − 1 ) . . . G 32 ( θ 2 ) G 21 ( θ 1 ) � H n = R n ... and after... G n , n ( θ n ) G n , n − 1 ( θ n − 1 ) . . . G 32 ( θ 2 ) G 21 ( θ 1 ) � H n + 1 = R n + 1 so modify the RHS in the same way G n , n ( θ n ) G n , n − 1 ( θ n − 1 ) . . . G 32 ( θ 2 ) G 21 ( θ 1 ) b = G n , n ( θ n ) b n . FJS MATH 612 14 / 43
Final words and we are done In GMRES/Arnoldi/Givens, in one step... We compute q n + 1 and the ( n + 1 ) -th column of � H n + 1 We apply all past rotations to the new column (we couldn’t do it before, because we didn’t know it) We find the new rotation to make � H n + 1 triangular. This adds one column to the right of upper triangular matrix. Nothing else is modified. Why? We apply the new rotation to the RHS We solve the new upper triangular system FJS MATH 612 15 / 43
TWO OBSERVATIONS FJS MATH 612 16 / 43
A frozen step of GMRES � Ax − b � 2 = minimum , x = K n c , c ∈ C n . Once again, the Krylov matrix is implicit to the Arnoldi process even if never explicitly produced b A n − 1 b K n = Ab . . . c 1 b + c 2 Ab + . . . + c n A n − 1 b K n c = c 1 Ab + c 2 A 2 b + . . . + c n A n b AK n c = b − c 1 Ab − c 2 A 2 b − . . . − c n A n b b − AK n c = ( I − c 1 A − c 2 A 2 − . . . − c n A n ) b = p n ( A ) b = where p n is a polynomial of degree n with p n ( 0 ) = 1. FJS MATH 612 17 / 43
An equivalent optimization problem Find a polynomial p n ( x ) = 1 − c 1 x − . . . − c n x n making � p n ( A ) b � 2 minimum . With those coefficients we compute x n = K n c ≈ x = A − 1 b Believe it or not, this is how GMRES convergence is analyzed. Warning. With exact arithmetic GMRES delivers the exact solution after n steps. Why? If we need to take n steps, GMRES requires storing the large Q j matrices (basically nothing else, and A is only used as an operator). If we restart GMRES after k iterations (remember x 0 ?), then the method is called restarted GMRES or GMRES( k ). FJS MATH 612 18 / 43
When to solve and stop We had phrased the stopping criterion as Minimize � � H j y j − � b � 2 e 1 � 2 with y j ∈ C j Compute x j = Q j y j ≈ x Decide if x j is close enough to x j − 1 . Stop if it is. What we do is H j = � Compute a full QR decomposition � Q j � R j Define ξ = � Q ∗ j ( � b � 2 e 1 ) ∈ C j + 1 Solve � R j y j = ξ j ignoring the last row Then res:= � � H j y j − � b � 2 e 1 � 2 = abs of last comp. of ξ j If res < tol Compute x j = Q j y j Stop end FJS MATH 612 19 / 43
LANCZOS FJS MATH 612 20 / 43
Arnoldi becomes Lanczos (1): Arnoldi’s iteration ... for j ≥ 1 % count on the next v = Aq j % the newcomer for i = 1 : j h ij = q ∗ i v % not R anymore v = v − h ij q i end ... end Aq j = h 1 j q 1 + h 2 j q 2 + . . . + h jj q j + h j + 1 , j q j + 1 We are going to focus now on the case of Hermitian matrices. FJS MATH 612 21 / 43
Arnoldi becomes Lanczos (2): the key property Assume that j ≥ 3 (we are trying to compute q j + 1 ). We create v = Aq j . We then compute h 1 j = q ∗ 1 Aq j = ( Aq 1 ) ∗ q j = 0 , � �� � ���� ∈K 2 ⊥K j − 1 and we correct v = v − h 1 j q 1 = v = Aq j . Proceeding by induction h ij = 0 i < j − 1 ⇐ ⇒ i + 1 < j . This means that the j column of the Hessenberg matrix � H n contains only three entries: h j − 1 , j , h j , j , h j + 1 , j . FJS MATH 612 22 / 43
Recommend
More recommend