ECS 231 Subspace Projection Methods for LS

  1. ECS 231 Subspace projection methods for LS

  2. Part I. Basics
     The landscape of solvers for linear systems of equations Ax = b:

                                          Direct      Iterative
         Nonsymmetric A                   LU          GMRES
         Symmetric positive definite A    Cholesky    CG

     Left to right: direct methods are more robust; iterative methods need less storage
     and are more general (they only need the action u = Av).
     Top to bottom: from nonsymmetric A to symmetric positive definite A, the methods
     become more robust.

  3. Part I. Basics
     A framework for subspace projection methods.
     - The basic idea: extract an approximate solution x̂ from a subspace of R^n;
       a technique of dimension reduction.
     - Mathematically, let 𝒲, 𝒱 ⊆ R^n be subspaces and x_0 be an initial guess of the
       solution. The subspace projection technique is to find
           x̂ = x_0 + z,  z ∈ 𝒲,  s.t.  b − A x̂ ⊥ 𝒱.                              (1)
       In other words, let r_0 = b − A x_0; then b − A x̂ = b − A(x_0 + z) = r_0 − A z,
       so (1) is equivalent to: find
           z ∈ 𝒲  s.t.  r_0 − A z ⊥ 𝒱.                                            (1a)
     - Orthogonal projection: 𝒲 = 𝒱.  Oblique projection: 𝒲 ≠ 𝒱.

  4. Part I. Basics
     In matrix notation, let V = [v_1, v_2, ..., v_m] be a basis of 𝒱 and
     W = [w_1, w_2, ..., w_m] be a basis of 𝒲. Then any approximate solution can be
     written as x̂ = x_0 + z = x_0 + W y, and the orthogonality condition (1a) implies
         V^T (r_0 − A z) = 0.
     Thus V^T A W y = V^T r_0, and assuming V^T A W is invertible, the new approximate
     solution is
         x̂ = x_0 + W (V^T A W)^{-1} V^T r_0.
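A small numerical check of the closed-form update above (an illustrative sketch; the sizes, random data, and variable names are arbitrary choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3
A = rng.standard_normal((n, n))          # a generic square matrix
b = rng.standard_normal(n)
x0 = np.zeros(n)
W = rng.standard_normal((n, m))          # basis matrix of the search subspace
V = rng.standard_normal((n, m))          # basis matrix of the constraint subspace

r0 = b - A @ x0
x_hat = x0 + W @ np.linalg.solve(V.T @ (A @ W), V.T @ r0)

# Orthogonality condition (1a): the new residual is orthogonal to the columns of V.
print(np.allclose(V.T @ (b - A @ x_hat), 0.0))   # expect True
```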

  5. Part I. Basics
     Prototype iterative subspace projection technique:

     Prototype Projection Method:
     0. Let x_0 be an initial approximation
     1. Iterate until convergence:
     2.    Select a pair of subspaces 𝒱 and 𝒲 of R^n
     3.    Generate basis matrices V and W for 𝒱 and 𝒲
     4.    r_0 ← b − A x_0
     5.    y ← (V^T A W)^{-1} V^T r_0
     6.    x_0 ← x_0 + W y
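A minimal NumPy sketch of this loop, assuming a hypothetical user-supplied routine choose_bases that returns the basis matrices V and W at each iteration (the routine name, tolerance, and iteration cap are illustrative, not from the slides):

```python
import numpy as np

def prototype_projection(A, b, x0, choose_bases, tol=1e-10, maxit=100):
    """Prototype projection method (illustrative sketch of Steps 0-6 above)."""
    x = x0.copy()
    for _ in range(maxit):
        r = b - A @ x                                 # Step 4: current residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                                     # Step 1: stop at convergence
        V, W = choose_bases(A, r)                     # Steps 2-3: pick subspaces and bases
        y = np.linalg.solve(V.T @ (A @ W), V.T @ r)   # Step 5: small projected system
        x = x + W @ y                                 # Step 6: update the iterate
    return x

# Example of a one-dimensional choice (used by the methods that follow):
# x = prototype_projection(A, b, x0, lambda A, r: (r[:, None], r[:, None]))
```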

  6. Part I. Basics
     Prototype iterative subspace projection technique, cont'd. Remarks:
     1. The matrix V^T A W does not have to be formed explicitly; it is typically a
        by-product of Steps 2 and 3.
     2. There are two important cases where the nonsingularity of V^T A W is guaranteed:
        1. If A is symmetric positive definite (SPD) and 𝒲 = 𝒱 (so W = V), then
           V^T A W = W^T A W is also SPD (and nonsingular).
        2. If A is nonsingular and 𝒱 = A𝒲 (so V = A W), then
           V^T A W = W^T A^T A W, which is SPD (and nonsingular).

  7. Part I. Basics
     Prototype iterative subspace projection technique in one dimension:
     𝒲 = span{w} and 𝒱 = span{v}. The new approximation takes the form
         x_0 ← x_0 + z = x_0 + α w,
     and the orthogonality condition (1a) implies
         v^T (r_0 − A z) = v^T (r_0 − α A w) = 0,
     and thus
         α = v^T r_0 / (v^T A w).

  8. Part I. Basics
     Steepest Descent (SD) method. When A is SPD, at each step take
         v = w = r_0 = b − A x_0.
     This yields the Steepest Descent (SD) Algorithm:
     1. Pick an initial guess x_0
     2. For k = 0, 1, 2, ... until convergence do
     3.    r_k = b − A x_k
     4.    α_k = r_k^T r_k / (r_k^T A r_k)
     5.    x_{k+1} = x_k + α_k r_k
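A minimal NumPy sketch of the SD algorithm (the stopping test and iteration cap are assumed conventions, not part of the slide):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, maxit=1000):
    """Steepest Descent for SPD A (illustrative sketch of steps 1-5 above)."""
    x = x0.copy()
    for _ in range(maxit):
        r = b - A @ x                      # step 3: residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                          # assumed convergence test
        alpha = (r @ r) / (r @ (A @ r))    # step 4: exact line-search step length
        x = x + alpha * r                  # step 5: move along the residual direction
    return x
```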

  9. Part I. Basics
     Steepest Descent (SD) method, cont'd. Remarks:
     - Since A is SPD, r_k^T A r_k > 0 unless r_k = 0.
     - Let x_* = A^{-1} b. Then the k-th step of the SD iteration minimizes
           f(x) ≡ (1/2) ||x_* − x||_A^2 = (1/2) (x_* − x)^T A (x_* − x)
       over all vectors of the form x_k − α ∇f(x_k), known as a line search. This is
       equivalent to
           α_k = argmin_α f(x_k − α ∇f(x_k)),
       where ∇f(x_k) = A x_k − b = −r_k is the gradient of f at x_k.
     - Recall from calculus that the negative gradient direction is locally the
       direction that yields the fastest rate of decrease for f.
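For completeness, a one-line derivation of the step length (a standard calculation consistent with Step 4 of the algorithm; not on the slide itself). Since x_k − α ∇f(x_k) = x_k + α r_k,

```latex
\[
\frac{d}{d\alpha} f(x_k + \alpha r_k)
  = r_k^T \nabla f(x_k + \alpha r_k)
  = r_k^T \bigl( A(x_k + \alpha r_k) - b \bigr)
  = -\,r_k^T r_k + \alpha\, r_k^T A r_k = 0
\quad\Longrightarrow\quad
\alpha_k = \frac{r_k^T r_k}{r_k^T A r_k}.
\]
```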

  10. Part I. Basics
      Minimal Residual (MR) Iteration. For a general nonsingular matrix A, at each step
      let w = r_0 and v = A r_0. This yields the Minimal Residual (MR) Algorithm:
      1. Pick an initial guess x_0
      2. For k = 0, 1, 2, ... until convergence do
      3.    r_k = b − A x_k
      4.    α_k = r_k^T A^T r_k / (r_k^T A^T A r_k)
      5.    x_{k+1} = x_k + α_k r_k

      Remark: each iteration minimizes f(x) ≡ ||r||_2^2 = ||b − A x||_2^2 over all
      vectors of the form x_k + α r_k (again a line search), which is equivalent to
      solving the least squares problem min_α ||b − A(x_k + α r_k)||_2.
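A minimal NumPy sketch of the MR iteration (stopping rule and iteration cap are assumed conventions):

```python
import numpy as np

def minimal_residual(A, b, x0, tol=1e-10, maxit=1000):
    """Minimal Residual iteration (illustrative sketch of steps 1-5 above)."""
    x = x0.copy()
    for _ in range(maxit):
        r = b - A @ x                      # step 3: residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                          # assumed convergence test
        Ar = A @ r
        alpha = (Ar @ r) / (Ar @ Ar)       # step 4: minimizes ||b - A(x + alpha r)||_2
        x = x + alpha * r                  # step 5
    return x
```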

  11. Part II: Krylov subspace and GMRES
      The Krylov subspace is defined as
          K_m(A, v) = span{v, Av, A^2 v, ..., A^{m−1} v}.
      Note that if x ∈ K_m(A, v), then x = p(A) v, where p is a polynomial of degree
      not exceeding m − 1.

  12. Part II: Krylov subspace and GMRES
      The Arnoldi procedure is an algorithm for building an orthonormal basis
      {v_1, v_2, ..., v_m} of the Krylov subspace K_m(A, v) using a modified
      Gram–Schmidt orthogonalization process.
      1.  v_1 = v / ||v||_2
      2.  for j = 1, 2, ..., m
      3.     compute w = A v_j
      4.     for i = 1, 2, ..., j
      5.        h_{ij} = v_i^T w
      6.        w := w − h_{ij} v_i
      7.     end for
      8.     h_{j+1,j} = ||w||_2
      9.     if h_{j+1,j} = 0, stop
      10.    v_{j+1} = w / h_{j+1,j}
      11. end for
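A NumPy sketch of the procedure above (the function name and the early-return convention on breakdown are illustrative choices):

```python
import numpy as np

def arnoldi(A, v, m):
    """Arnoldi with modified Gram-Schmidt.

    Returns V (n-by-(m+1)) with orthonormal columns and H ((m+1)-by-m upper
    Hessenberg); fewer columns are returned if the procedure breaks down."""
    n = len(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / np.linalg.norm(v)              # line 1
    for j in range(m):
        w = A @ V[:, j]                          # line 3
        for i in range(j + 1):                   # lines 4-7: orthogonalize against v_1..v_j
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)          # line 8
        if H[j + 1, j] == 0.0:                   # line 9: breakdown
            return V[:, :j + 1], H[:j + 2, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]            # line 10
    return V, H
```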

  13. Part II: Krylov subspace and GMRES
      Proposition. Assume that h_{j+1,j} ≠ 0 for j = 1, 2, ..., m. Then the vectors
      {v_1, v_2, ..., v_m} form an orthonormal basis of the Krylov subspace K_m(A, v):
          span{v_1, v_2, ..., v_m} = K_m(A, v).
      Proof: by induction.

  14. Part II: Krylov subspace and GMRES
      Let V_m = [v_1, v_2, ..., v_m] and let H_m = (h_{ij}) be the m-by-m upper
      Hessenberg matrix of the coefficients. Then in matrix form the Arnoldi procedure
      can be expressed by the following order-m Arnoldi decomposition:
          A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T = V_{m+1} H̄_m,                (2)
      where V_m^T V_m = I_m, V_m^T v_{m+1} = 0, and ||v_{m+1}||_2 = 1. In addition, we
      denote
          V_{m+1} = [V_m, v_{m+1}]   and   H̄_m = [ H_m ; h_{m+1,m} e_m^T ]   ((m+1)-by-m).
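A quick numerical check of (2), reusing the arnoldi sketch above (the matrix size and random data are arbitrary, and the check assumes no breakdown occurs):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 20, 5
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)

V, H = arnoldi(A, v, m)                 # V is n-by-(m+1), H is (m+1)-by-m
Vm, v_next = V[:, :m], V[:, m]          # V_m and v_{m+1}
Hm = H[:m, :]                           # square upper Hessenberg part H_m
e_m = np.zeros(m); e_m[-1] = 1.0

print(np.allclose(A @ Vm, Vm @ Hm + H[m, m - 1] * np.outer(v_next, e_m)))  # first form of (2)
print(np.allclose(A @ Vm, V @ H))                                          # second form of (2)
print(np.allclose(Vm.T @ Vm, np.eye(m)), np.allclose(Vm.T @ v_next, 0.0))  # orthonormality
```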
