The Faber-Manteuffel Theorem and its Consequences

Petr Tichý
joint work with Vance Faber and Jörg Liesen
Czech Academy of Sciences

July 21, 2011, ICIAM 2011, Vancouver, BC, Canada
Optimal Krylov subspace methods and low memory requirements?

Consider a system of linear algebraic equations A x = b, where A ∈ R^{n×n} is nonsingular and b ∈ R^n. Given x_0, find an optimal x_j ∈ x_0 + K_j(A, r_0) so that the error is minimized in a given vector norm.

What are necessary and sufficient conditions on A so that the optimal x_j can be computed using short recurrences, i.e., storing only a constant number of vectors?
Examples of optimal Krylov subspace methods with short recurrences

CG [Hestenes, Stiefel 1952]; MINRES, SYMMLQ [Paige, Saunders 1975].

Optimal in the sense that they minimize some error norm:
- ‖x − x_j‖_A in CG,
- ‖x − x_j‖_{AᵀA} = ‖r_j‖ in MINRES,
- ‖x − x_j‖ in SYMMLQ; here x_j ∈ x_0 + A K_j(A, r_0).

They generate an orthogonal (or A-orthogonal) Krylov subspace basis using a three-term recurrence,

  r_{j+1} = γ_j A r_j − α_j r_j − β_j r_{j−1}.

An important assumption: A is symmetric (MINRES, SYMMLQ) and positive definite (CG).
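This three-term recurrence is easy to experiment with. The following numpy sketch (not from the slides; coefficient names follow the formula above, with γ_j = 1 and no normalization) builds r_0, …, r_m for a random symmetric A and checks that the vectors come out mutually orthogonal:

```python
import numpy as np

def three_term_basis(A, r0, m):
    """Unnormalized three-term recurrence for symmetric A:
    r_{j+1} = A r_j - alpha_j r_j - beta_j r_{j-1}  (gamma_j = 1)."""
    R = [r0]
    for j in range(m):
        Ar = A @ R[j]
        alpha = (Ar @ R[j]) / (R[j] @ R[j])                 # makes r_{j+1} orthogonal to r_j
        w = Ar - alpha * R[j]
        if j > 0:
            beta = (Ar @ R[j - 1]) / (R[j - 1] @ R[j - 1])  # ... and to r_{j-1}
            w = w - beta * R[j - 1]
        R.append(w)
    return np.array(R)

rng = np.random.default_rng(0)
M = rng.standard_normal((8, 8))
A = M + M.T                              # symmetry is what makes 3 terms enough
R = three_term_basis(A, rng.standard_normal(8), 5)
G = R @ R.T                              # Gram matrix: should be (nearly) diagonal
print(np.max(np.abs(G - np.diag(np.diag(G)))) / np.max(np.abs(G)))
```

Orthogonality to all earlier r_i with i < j − 1 comes for free from the symmetry of A (since ⟨A r_j, r_i⟩ = ⟨r_j, A r_i⟩ and A r_i lies in the span of r_{i−1}, …, r_{i+1}), which is exactly why the assumption on A matters.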
Gene Golub (1932–2007)

By the end of the 1970s it was unknown whether such methods also existed for general nonsymmetric A. At Gatlinburg VIII (now the Householder Symposium), held in Oxford in 1981:

“A prize of $500 has been offered by Gene Golub for the construction of a 3-term conjugate gradient like descent method for non-symmetric real matrices or a proof that there can be no such method.”
What kind of method Golub had in mind

We want to solve A x = b using a CG-like descent method: the error is minimized in some given inner product norm, ‖·‖_B = ⟨·,·⟩_B^{1/2}.

Starting from x_0, compute

  x_{j+1} = x_j + α_j p_j,   j = 0, 1, …,

where p_j is a direction vector, α_j is a scalar (to be determined), and

  span{p_0, …, p_j} = K_{j+1}(A, r_0),   r_0 = b − A x_0.

‖x − x_{j+1}‖_B is minimal iff

  α_j = ⟨x − x_j, p_j⟩_B / ⟨p_j, p_j⟩_B   and   ⟨p_j, p_i⟩_B = 0 for i < j.

Hence p_0, …, p_j has to be a B-orthogonal basis of K_{j+1}(A, r_0).
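For B = A (the CG setting, with A symmetric positive definite) the step length is computable even though it formally involves the unknown x, since ⟨x − x_j, p_j⟩_A = ⟨r_j, p_j⟩. A small numpy sketch (illustrative only, not from the slides) of one such descent step, verifying that the formula really minimizes the A-norm error along the direction:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6 * np.eye(6)     # SPD, so ||.||_A is a norm
b = rng.standard_normal(6)
x_true = np.linalg.solve(A, b)  # only used to evaluate the error norm

x = np.zeros(6)                 # x_0
r = b - A @ x                   # residual r_0
p = r                           # steepest-descent direction p_0 = r_0
# With B = A the step <x - x_j, p_j>_A / <p_j, p_j>_A reduces to
# <r_j, p_j> / <p_j, A p_j>, computable without knowing x:
alpha = (r @ p) / (p @ (A @ p))

def err_A(a):                   # ||x - (x_j + a p_j)||_A as a function of a
    e = x_true - (x + a * p)
    return np.sqrt(e @ (A @ e))

# alpha should minimize the A-norm error along p
assert err_A(alpha) < err_A(alpha + 1e-3)
assert err_A(alpha) < err_A(alpha - 1e-3)
print("optimal alpha:", alpha)
```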
Optimal Krylov subspace method with short recurrences

The question about the existence of an optimal Krylov subspace method with short recurrences can be reduced to the question:

For which A is it possible to generate a B-orthogonal basis of the Krylov subspace using short recurrences (for each initial starting vector)?
Faber, Manteuffel 1984

Faber and Manteuffel gave the answer in 1984: for a general matrix A there exists no short recurrence for generating orthogonal Krylov subspace bases.

What are the details of this statement?
Outline
1. The Faber-Manteuffel theorem
2. Ideas of a new proof
3. Consequences
4. Other types of recurrences
Formulation of the problem: B-inner product, input, and notation

Without loss of generality, B = I. Otherwise change the basis:

  ⟨x, y⟩_B = ⟨B^{1/2} x, B^{1/2} y⟩,   Â ≡ B^{1/2} A B^{−1/2},   v̂ ≡ B^{1/2} v.

Input data:
- A ∈ C^{n×n}, a nonsingular matrix,
- v ∈ C^n, an initial vector.

Notation:
- d_min(A) … the degree of the minimal polynomial of A,
- d = d(A, v) … the grade of v with respect to A, the smallest d such that K_d(A, v) is invariant under multiplication with A.
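The change of basis can be checked numerically. A minimal numpy sketch (not from the slides), with B an arbitrary SPD matrix and B^{1/2} computed from its eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
B = M @ M.T + 5 * np.eye(5)        # SPD, defines <x, y>_B = x^T B y
A = rng.standard_normal((5, 5))
x, y = rng.standard_normal(5), rng.standard_normal(5)

# B^{1/2} via the eigendecomposition of the SPD matrix B
lam, Q = np.linalg.eigh(B)
B_half = Q @ np.diag(np.sqrt(lam)) @ Q.T
B_half_inv = Q @ np.diag(1 / np.sqrt(lam)) @ Q.T

# <x, y>_B = <B^{1/2} x, B^{1/2} y>
assert np.isclose(x @ (B @ y), (B_half @ x) @ (B_half @ y))
# A_hat = B^{1/2} A B^{-1/2} acts on the transformed vector B^{1/2} x
A_hat = B_half @ A @ B_half_inv
assert np.allclose(B_half @ (A @ x), A_hat @ (B_half @ x))
```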
Formulation of the problem: Our Goal

Generate a basis v_1, …, v_d of K_d(A, v) such that
1. span{v_1, …, v_j} = K_j(A, v), for j = 1, …, d,
2. ⟨v_i, v_j⟩ = 0, for i ≠ j, i, j = 1, …, d.

The Arnoldi algorithm is the standard way of generating the orthogonal basis (no normalization, for convenience): v_1 ≡ v,

  v_{j+1} = A v_j − Σ_{i=1}^{j} h_{i,j} v_i,   h_{i,j} = ⟨A v_j, v_i⟩ / ⟨v_i, v_i⟩,   j = 1, …, d − 1.
Formulation of the problem: The Arnoldi algorithm — matrix representation

In matrix notation, with v_1 = v,

  A [v_1, …, v_{d−1}] = [v_1, …, v_d] H_{d,d−1},
       (≡ V_{d−1})        (≡ V_d)

where H_{d,d−1} is the d × (d−1) upper Hessenberg matrix with entries h_{i,j} above the diagonal and ones on the subdiagonal. Moreover, V_d* V_d is diagonal and d = dim K_n(A, v).

If H_{d,d−1} has upper bandwidth s + 1, its j-th column gives an (s+2)-term recurrence:

  v_{j+1} = A v_j − Σ_{i=j−s}^{j} h_{i,j} v_i.
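The matrix relation above can be verified directly. A numpy sketch (not from the slides) of the unnormalized Arnoldi algorithm, checking both A V_{d−1} = V_d H_{d,d−1} and the diagonality of V_d* V_d for a random real matrix:

```python
import numpy as np

def arnoldi_no_norm(A, v, d):
    """Unnormalized Arnoldi: v_{j+1} = A v_j - sum_{i<=j} h_{ij} v_i with
    h_{ij} = <A v_j, v_i>/<v_i, v_i>.  Returns V_d and the d x (d-1)
    upper Hessenberg matrix H with ones on its subdiagonal."""
    V = [v]
    H = np.zeros((d, d - 1))
    for j in range(d - 1):
        w = A @ V[j]
        for i in range(j + 1):
            H[i, j] = (w @ V[i]) / (V[i] @ V[i])
            w = w - H[i, j] * V[i]
        H[j + 1, j] = 1.0                # no normalization: subdiagonal is 1
        V.append(w)
    return np.column_stack(V), H

rng = np.random.default_rng(2)
A = rng.standard_normal((7, 7))
V, H = arnoldi_no_norm(A, rng.standard_normal(7), 7)
assert np.allclose(A @ V[:, :-1], V @ H)          # A V_{d-1} = V_d H_{d,d-1}
G = V.T @ V                                       # V* V should be diagonal
assert np.max(np.abs(G - np.diag(np.diag(G)))) < 1e-6 * np.max(np.abs(G))
```

For a generic nonsymmetric A the computed H has no band structure: every column couples v_{j+1} to all previous basis vectors, which is exactly the full-recurrence situation the theorem is about.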
Formulation of the problem: Optimal short recurrences (Definition — [Liesen, Strakoš 2008])

A admits an optimal (s+2)-term recurrence if, for any v, H_{d,d−1} is at most (s+2)-band Hessenberg, and for at least one v, H_{d,d−1} is exactly (s+2)-band Hessenberg. (An (s+2)-band Hessenberg matrix has s + 1 possibly nonzero diagonals above the subdiagonal, so in A V_{d−1} = V_d H_{d,d−1} each v_{j+1} is coupled to only the s + 1 previous basis vectors.)

Sufficient and necessary conditions on A?
The Faber-Manteuffel theorem

Definition. If A* = p_s(A), where p_s is a polynomial of the smallest possible degree s, then A is called normal(s).

Theorem [Faber, Manteuffel 1984], [Liesen, Strakoš 2008]. Let A be nonsingular and let s be nonnegative with s + 2 < d_min(A). Then A admits an optimal (s+2)-term recurrence if and only if A is normal(s).

Sufficiency is straightforward; necessity is not. Key words from the proof of necessity in [Faber, Manteuffel 1984] include: “continuous function” (analysis), “closed set of smaller dimension” (topology), “wedge product” (multilinear algebra).
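The s = 1 case of the theorem can be seen numerically. A real symmetric matrix is normal(1) (A* = A, a degree-1 polynomial in A), so Arnoldi must produce a tridiagonal H, i.e., a 3-term recurrence; a generic nonsymmetric matrix is not normal(s) for small s and H fills up. A numpy sketch (not from the slides; normalized Arnoldi is used here for numerical stability):

```python
import numpy as np

def arnoldi_hessenberg(A, v, d):
    """Standard (normalized) Arnoldi; returns the d x (d-1) upper
    Hessenberg matrix H with h_{ij} = <A v_j, v_i>."""
    V = [v / np.linalg.norm(v)]
    H = np.zeros((d, d - 1))
    for j in range(d - 1):
        w = A @ V[j]
        for i in range(j + 1):
            H[i, j] = w @ V[i]
            w -= H[i, j] * V[i]
        H[j + 1, j] = np.linalg.norm(w)
        V.append(w / H[j + 1, j])
    return H

def upper_bandwidth(H, rtol=1e-8):
    """Largest j - i with |H[i, j]| above rtol * max|H|."""
    t = rtol * np.max(np.abs(H))
    return max(j - i for i in range(H.shape[0])
               for j in range(H.shape[1]) if abs(H[i, j]) > t)

rng = np.random.default_rng(3)
M = rng.standard_normal((8, 8))
v = rng.standard_normal(8)
# symmetric A is normal(1): H is tridiagonal, a 3-term recurrence
assert upper_bandwidth(arnoldi_hessenberg(M + M.T, v, 8)) == 1
# generic nonsymmetric A is not normal(s) for small s: H fills up
assert upper_bandwidth(arnoldi_hessenberg(M, v, 8)) > 1
```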
A new proof of the Faber-Manteuffel theorem

Motivated by the paper [Liesen, Strakoš 2008], which contains a completely reworked theory of short recurrences for generating orthogonal Krylov subspace bases:

“It is unknown if a simpler proof of the necessity part can be found. In view of the fundamental nature of the Faber-Manteuffel Theorem, such proof would be a welcome addition to the existing literature. It would lead to a better understanding of the theorem by enlightening some (possibly unexpected) relationships, and it would also be more suitable for classroom teaching.”

In [Faber, Liesen, T. 2008] we give two new proofs of the Faber-Manteuffel theorem that use more elementary tools.
Extension of A V_{d−1} = V_d H_{d,d−1}: matrix representation of A in V_d

Since K_d(A, v) is invariant, A v_d ∈ K_d(A, v), and

  A v_d = Σ_{i=1}^{d} h_{i,d} v_i.

Appending this column gives A V_d = V_d H_d, where H_d is a d × d upper Hessenberg matrix whose first d − 1 columns have upper bandwidth s + 1, while the last column can be full.
Idea of the proof: unitary transformation of the upper Hessenberg matrix (for simplicity, we omit the indices of V_d and H_{d,d})

Proof by contradiction. Let A admit an optimal (s+2)-term recurrence and let A not be normal(s). Then there exists a starting vector v such that h_{1,d} ≠ 0, and for any unitary G,

  A (V G) = (V G)(G* H G).

Find a unitary G such that G* H G is unreduced upper Hessenberg, but G* H G is not (s+2)-band (up to the last column).
Faber-Manteuffel Theorem – Summary

Generating an orthogonal basis of K_d(A, v) via an Arnoldi-type recurrence:

  A admits an (s+2)-term Arnoldi-type recurrence  ⇔  A is normal(s), i.e., A* = p(A) with deg p = s
  [Faber, Manteuffel 1984], [Liesen, Strakoš 2008]

When is A normal(s)? [Khavinson, Świątek 2003]:
1. s = 1 if and only if A is normal and the eigenvalues of A lie on a line in C.
2. For s > 1, A has at most 3s − 2 different eigenvalues.

Hence the only interesting case is s = 1: normal matrices with collinear eigenvalues. All classes of “interesting” matrices are known.
When is A orthogonally reducible to (s+2)-band Hessenberg form?

The matrix representation of the Arnoldi algorithm can be extended by one column to A V_d = V_d H_d, where H_d ∈ C^{d×d} is an unreduced upper Hessenberg matrix.

We say that A is orthogonally reducible to (s+2)-band Hessenberg form if H_d is an (s+2)-band Hessenberg matrix for each starting vector v_1.

What are the necessary and sufficient conditions on A to be orthogonally reducible to (s+2)-band Hessenberg form?
When is A orthogonally reducible to (s+2)-band Hessenberg form?

  A is normal(s), A* = p(A)
    ⇒ A admits an optimal (s+2)-term recurrence
    ⇒ A is orthogonally reducible to (s+2)-band Hessenberg form