Tikhonov regularization and matrix function evaluation Paolo Novati Department of Pure and Applied Mathematics, University of Padova - Italy Joint work with Maria Rosaria Russo (Padova - Italy) Michela Redivo Zaglia (Padova - Italy) February 2012 Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 1 / 27
The problem We consider ill-conditioned linear systems Ax = b We mainly focus the attention on full-rank problems in which the singular values of A decay gradually to zero. Discretization of compact operators, as in the case of Fredholm integral equations of the …rst kind. Vandermonde type systems arising from interpolation. Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 2 / 27
Motivations We want to construct an iterative solver able to overcome some of the typical drawback of the classical iterative solvers: Semi-convergence : the iterates initially approach the solution but quite rapidly diverge Strong dependence on the parameter-choice strategy : in order to prevent divergence a reliable stopping criterium has to be used Poor accuracy : typically holds for Krylov type methods based on the use of A T A (CGLS) Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 3 / 27
Outline Reformulation of the problem in term of a matrix function evaluation Extension to Tikhonov regularization Theoretical and numerical error analysis Filter factor analysis The choice of the regularization parameters Numerical experiments Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 4 / 27
Reformulation of the problem The basic idea is to solve the system Ax = b in two steps, …rst solving in some way the regularized system ( A + λ I ) x λ = b (Franklin’s method) and then recovering the solution x from the system ( A + λ I ) � 1 Ax = x λ that is equivalent to compute x = f ( A ) x λ where f ( z ) = 1 + λ z � 1 Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 5 / 27
The choice of lambda By the de…nition of f the attainable accuracy depends on the the conditioning of ( A + λ I ) � 1 A . Theoretically the best situation is attained de…ning λ such that κ ( A + λ I ) = κ (( A + λ I ) � 1 A ) In the SPD case taking q p λ = λ 1 λ N � 1 / κ ( A ) where λ 1 and λ N are respectively the smallest and the largest eigenvalue of A , we obtain q κ ( A + λ I ) = κ (( A + λ I ) � 1 A ) = κ ( A ) Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 6 / 27
The computation of the matrix function - The Arnoldi algorithm For the computation of f ( A ) x λ we use the standard Arnoldi method projecting the matrix A onto the Krylov subspaces generated by A and x λ K m ( A , x λ ) = span f x λ , Ax λ , ..., A m � 1 x λ g For the construction of the subspaces K m ( A , x λ ) the Arnoldi algorithm generates an orthonormal sequence f v j g j � 1 , with v 1 = x λ / k x λ k , such that K m ( A , x λ ) = span f v 1 , v 2 , ..., v m g . For every m , AV m = V m H m + h m + 1 , m v m + 1 e T m . where V m = [ v 1 , v 2 , ..., v m ] , H m is an upper Hessenberg matrix with entries h i , j = v T i Av j and e j is the j -th vector of the canonical basis of R m . Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 7 / 27
The computation of the matrix function - The ASP method The m -th Arnoldi approximation to x = f ( A ) x λ is de…ned as x m = k x λ k V m f ( H m ) e 1 At each step of the Arnoldi algorithm we have to compute the vector w j = Av j . The method theoretically converges in a …nite number of steps. For the computation of f ( H m ) we employ the Schur-Parlett algorithm (Golub and Van Loan 1983). Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 8 / 27
Extension to Tikhonov regularization - The ATP method The method can be extended to problems in which the exact right hand side b is a¤ected by noise. We assume to work with a perturbed right-hand side b = b + e b . Given λ and H we solve the regularized system ( A T A + λ H T H ) x λ = A T b . and then we approximate x by computing � � � 1 A T A ( A T A + λ H T H ) x λ = f ( Q ) x λ x = � � � 1 � � H T H A T A where Q = . As before, for the computation of f ( Q ) x λ we use the standard Arnoldi method projecting the matrix Q onto the Krylov subspaces generated by Q and x λ . Now, at each step we have have to compute the vectors w j = Qv j , j � 1, with v 1 = x λ / k x λ k , that is, to � � � � H T H A T A solve the systems w j = v j . Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 9 / 27
Theoretical error analysis The …eld of values of A is de…ned as � y H Ay � y H y , y 2 C N n f 0 g F ( A ) : = Theorem Assume that F ( A ) � C + . Then λ m a m + 1 ∏ k f ( A ) x λ � k x λ k V m f ( H m ) e 1 k � K i = 1 h i + 1 , i k x λ k , where a > 0 is the leftmost point of F ( A ) and 2 � K � 11 . 08 (Crouzeix 2007; in the symmetric case K = 1 ). Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 10 / 27
Some general considerations The rate of convergence is almost independent of the choice of λ , and this is con…rmed by the numerical experiments. The error decay is related with the rate of the decay of ∏ m i = 1 h i + 1 , i . Theorem (From a result by Nevanlinna 1993) Let σ j , j � 1 , be the singular values of an operator A. If σ p ∑ j < ∞ for a certain p > 0 j � 1 then � η ep � m / p m ∏ i = 1 h i + 1 , i � m where η � 1 + p σ p ∑ j p j � 1 Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 11 / 27
Some general considerations For discrete ill-posed problems the rate of decay of ∏ m i = 1 h i + 1 , i is superlinear and depends on the p -summability of the singular values of A , i.e., on the degree of ill-posedness of the problem. Each Arnoldi-based method (CG, FOM, GMRES) shows the same rate of convergence Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 12 / 27
Error analysis in computer arithmetics for the ASP method We need to assume that x λ , solution of ( A + λ I ) x λ = b , is approximated by x λ with an accuracy depending on the choice of λ and the method used. In this way, the Arnoldi algorithm actually constructs the Krylov subspaces K m ( A , x λ ) . Hence for the error E m : = x � k x λ k V m f ( H m ) e 1 we have k E m k = k f ( A ) x λ � k x λ k V m f ( H m ) e 1 k � k f ( A ) x λ � k x λ k V m f ( H m ) e 1 k + k f ( A ) ( x λ � x λ ) k For small values of λ , f ( A ) � I and we have that k E m k � k x λ � x λ k . The method is not able to improve the accuracy provided by the solution of the initial system. For large λ we have that x λ � x λ , but even assuming that k f ( A ) ( x λ � x λ ) k � 0, we have another lower bound due the ill-conditioning of f ( A ) = A � 1 ( A + λ I ) . Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 13 / 27
Error analysis in computer arithmetics for the ATP method The error is given by E m : = f ( Q ) x λ � p m � 1 ( Q ) x λ ( p m � 1 interpolates f at the eigenvalues of Q ) where ( A T A + λ H T H ) x λ = A T b and ( A T A + λ H T H ) x λ = A T b . As before we can write k E m k � k f ( Q ) x λ � p m � 1 ( Q ) x λ k + k f ( Q ) ( x λ � x λ ) k . Theoretically we may choose λ very large, thus oversmoothing, in order to reduce the e¤ect of noise. Unfortunately, the main problem is that, as before, f ( Q ) may be ill-conditioned for λ large. Even in this case we should …nd a compromise for the selection of a suitable value of λ , but contrary to the ASP method for noise-free problems it is di¢cult to design a theoretical strategy. Indeed everything depends on the problem and on the operator H . Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 14 / 27
Filter factors Assume that A is diagonalizable, that is, A = XDX � 1 where D = diag ( λ 1 , ..., λ N ) . For the ASP method we have � � X � 1 b N λ i i ∑ x λ = x i , λ i + λ λ i i = 1 where x i is the eigenvector associated with λ i . After the …rst phase, the …lter factors are thus g i = λ i ( λ i + λ ) � 1 . The Arnoldi algorithm produces approximations of the type x m = p m � 1 ( A ) x λ , where p m � 1 interpolates the function f at the eigenvalues of H m . Hence � � X � 1 b N λ i p m � 1 ( λ i ) i ∑ x m = x i . λ i + λ λ i i = 1 and the …lter factors are given by = λ i p m � 1 ( λ i ) f ( m ) , i = 1 , ..., N . i λ i + λ Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 15 / 27
Filter factors - an example GRAVITY(12) - Filter factors g i (asterisk) and f ( m ) (circle) with p i m = 4 , 6 , 8 , 10. λ = 1 / κ ( A ) m = 4 m = 6 1 .5 1 .5 1 1 0 .5 0 .5 0 0 - 0 .5 - 1 - 0 .5 0 5 1 0 1 5 0 5 1 0 1 5 m = 8 m = 1 0 1 .5 1 .5 1 1 0 .5 0 .5 0 0 - 0 .5 - 0 .5 0 5 1 0 1 5 0 5 1 0 1 5 The Arnoldi (Lanczos) algorithm initially picks up the largest eigenvalues, hence it automatically corrects the …lters corresponding to the low-middle frequencies ( g i ! f ( m ) � 1), keep damping the highest ones. i Paolo Novati - University of Padova () Tikhonov - matrix function February 2012 16 / 27
Recommend
More recommend