ECS 231 Subspace Projection Methods for LS

  1. ECS 231 Subspace projection methods for LS

  2. Part I. Basics
     The landscape of solvers for linear systems of equations Ax = b:

                                          Direct      Iterative
         Nonsymmetric A                   LU          GMRES
         Symmetric positive definite A    Cholesky    CG

     Left to right: direct methods are more robust; iterative methods need less storage
     and are more general (they only need the action u = Av).
     Top to bottom: from nonsymmetric A to symmetric positive definite A, the methods
     become more robust.

  3. Part I. Basics
     A framework for subspace projection methods.
     - The basic idea: extract an approximate solution x̂ from a subspace of R^n;
       a technique of dimension reduction.
     - Mathematically, let 𝒲, 𝒱 ⊆ R^n be subspaces and x_0 be an initial guess of the
       solution. The subspace projection technique is to find
           x̂ = x_0 + z,  z ∈ 𝒲,  s.t.  b − A x̂ ⊥ 𝒱.                              (1)
       In other words, let r_0 = b − A x_0; then b − A x̂ = b − A(x_0 + z) = r_0 − A z,
       so (1) is equivalent to: find
           z ∈ 𝒲  s.t.  r_0 − A z ⊥ 𝒱.                                            (1a)
     - Orthogonal projection: 𝒲 = 𝒱.  Oblique projection: 𝒲 ≠ 𝒱.

  4. Part I. Basics
     In matrix notation, let V = [v_1, v_2, ..., v_m] be a basis of 𝒱 and
     W = [w_1, w_2, ..., w_m] be a basis of 𝒲. Then any approximate solution can be
     written as x̂ = x_0 + z = x_0 + W y, and the orthogonality condition (1a) implies
         V^T (r_0 − A z) = 0.
     Thus V^T A W y = V^T r_0, and assuming V^T A W is invertible, the new approximate
     solution is
         x̂ = x_0 + W (V^T A W)^{-1} V^T r_0.
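A small numerical check of the closed-form update above (an illustrative sketch; the sizes, random data, and variable names are arbitrary choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3
A = rng.standard_normal((n, n))          # a generic square matrix
b = rng.standard_normal(n)
x0 = np.zeros(n)
W = rng.standard_normal((n, m))          # basis matrix of the search subspace
V = rng.standard_normal((n, m))          # basis matrix of the constraint subspace

r0 = b - A @ x0
x_hat = x0 + W @ np.linalg.solve(V.T @ (A @ W), V.T @ r0)

# Orthogonality condition (1a): the new residual is orthogonal to the columns of V.
print(np.allclose(V.T @ (b - A @ x_hat), 0.0))   # expect True
```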

  5. Part I. Basics
     Prototype iterative subspace projection technique:

     Prototype Projection Method:
     0. Let x_0 be an initial approximation
     1. Iterate until convergence:
     2.    Select a pair of subspaces 𝒱 and 𝒲 of R^n
     3.    Generate basis matrices V and W for 𝒱 and 𝒲
     4.    r_0 ← b − A x_0
     5.    y ← (V^T A W)^{-1} V^T r_0
     6.    x_0 ← x_0 + W y
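A minimal NumPy sketch of this loop, assuming a hypothetical user-supplied routine choose_bases that returns the basis matrices V and W at each iteration (the routine name, tolerance, and iteration cap are illustrative, not from the slides):

```python
import numpy as np

def prototype_projection(A, b, x0, choose_bases, tol=1e-10, maxit=100):
    """Prototype projection method (illustrative sketch of Steps 0-6 above)."""
    x = x0.copy()
    for _ in range(maxit):
        r = b - A @ x                                 # Step 4: current residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                                     # Step 1: stop at convergence
        V, W = choose_bases(A, r)                     # Steps 2-3: pick subspaces and bases
        y = np.linalg.solve(V.T @ (A @ W), V.T @ r)   # Step 5: small projected system
        x = x + W @ y                                 # Step 6: update the iterate
    return x

# Example of a one-dimensional choice (used by the methods that follow):
# x = prototype_projection(A, b, x0, lambda A, r: (r[:, None], r[:, None]))
```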

  6. Part I. Basics
     Prototype iterative subspace projection technique, cont'd. Remarks:
     1. The matrix V^T A W does not have to be formed explicitly; it is typically a
        by-product of Steps 2 and 3.
     2. There are two important cases where the nonsingularity of V^T A W is guaranteed:
        1. If A is symmetric positive definite (SPD) and 𝒲 = 𝒱 (so W = V), then
           V^T A W = W^T A W is also SPD (and nonsingular).
        2. If A is nonsingular and 𝒱 = A𝒲 (so V = A W), then
           V^T A W = W^T A^T A W, which is SPD (and nonsingular).

  7. Part I. Basics
     Prototype iterative subspace projection technique in one dimension:
     𝒲 = span{w} and 𝒱 = span{v}. The new approximation takes the form
         x_0 ← x_0 + z = x_0 + α w,
     and the orthogonality condition (1a) implies
         v^T (r_0 − A z) = v^T (r_0 − α A w) = 0,
     and thus
         α = v^T r_0 / (v^T A w).

  8. Part I. Basics
     Steepest Descent (SD) method. When A is SPD, at each step take
         v = w = r_0 = b − A x_0.
     This yields the Steepest Descent (SD) Algorithm:
     1. Pick an initial guess x_0
     2. For k = 0, 1, 2, ... until convergence do
     3.    r_k = b − A x_k
     4.    α_k = r_k^T r_k / (r_k^T A r_k)
     5.    x_{k+1} = x_k + α_k r_k
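A minimal NumPy sketch of the SD algorithm (the stopping test and iteration cap are assumed conventions, not part of the slide):

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-10, maxit=1000):
    """Steepest Descent for SPD A (illustrative sketch of steps 1-5 above)."""
    x = x0.copy()
    for _ in range(maxit):
        r = b - A @ x                      # step 3: residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                          # assumed convergence test
        alpha = (r @ r) / (r @ (A @ r))    # step 4: exact line-search step length
        x = x + alpha * r                  # step 5: move along the residual direction
    return x
```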

  9. Part I. Basics
     Steepest Descent (SD) method, cont'd. Remarks:
     - Since A is SPD, r_k^T A r_k > 0 unless r_k = 0.
     - Let x_* = A^{-1} b. Then the k-th step of the SD iteration minimizes
           f(x) ≡ (1/2) ||x_* − x||_A^2 = (1/2) (x_* − x)^T A (x_* − x)
       over all vectors of the form x_k − α ∇f(x_k), known as a line search. This is
       equivalent to
           α_k = argmin_α f(x_k − α ∇f(x_k)),
       where ∇f(x_k) = A x_k − b = −r_k is the gradient of f at x_k.
     - Recall from calculus that the negative gradient direction is locally the
       direction that yields the fastest rate of decrease for f.
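For completeness, a one-line derivation of the step length (a standard calculation consistent with Step 4 of the algorithm; not on the slide itself). Since x_k − α ∇f(x_k) = x_k + α r_k,

```latex
\[
\frac{d}{d\alpha} f(x_k + \alpha r_k)
  = r_k^T \nabla f(x_k + \alpha r_k)
  = r_k^T \bigl( A(x_k + \alpha r_k) - b \bigr)
  = -\,r_k^T r_k + \alpha\, r_k^T A r_k = 0
\quad\Longrightarrow\quad
\alpha_k = \frac{r_k^T r_k}{r_k^T A r_k}.
\]
```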

  10. Part I. Basics
      Minimal Residual (MR) Iteration. For a general nonsingular matrix A, at each step
      let w = r_0 and v = A r_0. This yields the Minimal Residual (MR) Algorithm:
      1. Pick an initial guess x_0
      2. For k = 0, 1, 2, ... until convergence do
      3.    r_k = b − A x_k
      4.    α_k = r_k^T A^T r_k / (r_k^T A^T A r_k)
      5.    x_{k+1} = x_k + α_k r_k

      Remark: each iteration minimizes f(x) ≡ ||r||_2^2 = ||b − A x||_2^2 over all
      vectors of the form x_k + α r_k (again a line search), which is equivalent to
      solving the least squares problem min_α ||b − A(x_k + α r_k)||_2.
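A minimal NumPy sketch of the MR iteration (stopping rule and iteration cap are assumed conventions):

```python
import numpy as np

def minimal_residual(A, b, x0, tol=1e-10, maxit=1000):
    """Minimal Residual iteration (illustrative sketch of steps 1-5 above)."""
    x = x0.copy()
    for _ in range(maxit):
        r = b - A @ x                      # step 3: residual
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break                          # assumed convergence test
        Ar = A @ r
        alpha = (Ar @ r) / (Ar @ Ar)       # step 4: minimizes ||b - A(x + alpha r)||_2
        x = x + alpha * r                  # step 5
    return x
```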

  11. Part II: Krylov subspace and GMRES
      The Krylov subspace is defined as
          K_m(A, v) = span{v, Av, A^2 v, ..., A^{m−1} v}.
      Note that if x ∈ K_m(A, v), then x = p(A) v, where p is a polynomial of degree
      not exceeding m − 1.

  12. Part II: Krylov subspace and GMRES
      The Arnoldi procedure is an algorithm for building an orthonormal basis
      {v_1, v_2, ..., v_m} of the Krylov subspace K_m(A, v) using a modified
      Gram–Schmidt orthogonalization process.
      1.  v_1 = v / ||v||_2
      2.  for j = 1, 2, ..., m
      3.     compute w = A v_j
      4.     for i = 1, 2, ..., j
      5.        h_{ij} = v_i^T w
      6.        w := w − h_{ij} v_i
      7.     end for
      8.     h_{j+1,j} = ||w||_2
      9.     if h_{j+1,j} = 0, stop
      10.    v_{j+1} = w / h_{j+1,j}
      11. end for
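A NumPy sketch of the procedure above (the function name and the early-return convention on breakdown are illustrative choices):

```python
import numpy as np

def arnoldi(A, v, m):
    """Arnoldi with modified Gram-Schmidt.

    Returns V (n-by-(m+1)) with orthonormal columns and H ((m+1)-by-m upper
    Hessenberg); fewer columns are returned if the procedure breaks down."""
    n = len(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / np.linalg.norm(v)              # line 1
    for j in range(m):
        w = A @ V[:, j]                          # line 3
        for i in range(j + 1):                   # lines 4-7: orthogonalize against v_1..v_j
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)          # line 8
        if H[j + 1, j] == 0.0:                   # line 9: breakdown
            return V[:, :j + 1], H[:j + 2, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]            # line 10
    return V, H
```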

  13. Part II: Krylov subspace and GMRES
      Proposition. Assume that h_{j+1,j} ≠ 0 for j = 1, 2, ..., m. Then the vectors
      {v_1, v_2, ..., v_m} form an orthonormal basis of the Krylov subspace K_m(A, v):
          span{v_1, v_2, ..., v_m} = K_m(A, v).
      Proof: by induction.

  14. Part II: Krylov subspace and GMRES
      Let V_m = [v_1, v_2, ..., v_m] and let H_m = (h_{ij}) be the m-by-m upper
      Hessenberg matrix of the coefficients. Then in matrix form the Arnoldi procedure
      can be expressed by the following order-m Arnoldi decomposition:
          A V_m = V_m H_m + h_{m+1,m} v_{m+1} e_m^T = V_{m+1} H̄_m,                (2)
      where V_m^T V_m = I_m, V_m^T v_{m+1} = 0, and ||v_{m+1}||_2 = 1. In addition, we
      denote
          V_{m+1} = [V_m, v_{m+1}]   and   H̄_m = [ H_m ; h_{m+1,m} e_m^T ]   ((m+1)-by-m).
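A quick numerical check of (2), reusing the arnoldi sketch above (the matrix size and random data are arbitrary, and the check assumes no breakdown occurs):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 20, 5
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)

V, H = arnoldi(A, v, m)                 # V is n-by-(m+1), H is (m+1)-by-m
Vm, v_next = V[:, :m], V[:, m]          # V_m and v_{m+1}
Hm = H[:m, :]                           # square upper Hessenberg part H_m
e_m = np.zeros(m); e_m[-1] = 1.0

print(np.allclose(A @ Vm, Vm @ Hm + H[m, m - 1] * np.outer(v_next, e_m)))  # first form of (2)
print(np.allclose(A @ Vm, V @ H))                                          # second form of (2)
print(np.allclose(Vm.T @ Vm, np.eye(m)), np.allclose(Vm.T @ v_next, 0.0))  # orthonormality
```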
