Parallel Solution of Symmetric Eigenvalue Problems Zack 2/21/2014
• Typically, the eigenvalue problem is solved in three successive steps: – Reduction to tridiagonal form – Solution of the tridiagonal eigenproblem – Back transformation of the computed eigenvectors to those of the original matrix
• (2005) Computing all eigenpairs of a 15,000 × 15,000 matrix on 16 processors: – 546 s for reduction – 22.2 s for tridiagonal solution – 160 s for back transformation
Reduction to banded (full matrix) • Householder transformation • Original one-step tridiagonalization
If we only reduce the matrix to banded form, the update can be done with a block orthogonal transformation, such as the WY representation $Q = I + W Y^T$ with width-b matrices W and Y. The vast majority of the operations can then be done with highly efficient BLAS-3 routines instead of memory-bandwidth-limited BLAS-2 or BLAS-1 routines.
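As an illustration of the WY idea, here is a minimal NumPy sketch that accumulates b Householder reflectors into $Q = H_1 H_2 \cdots H_b = I + W Y^T$ and then applies $Q^T$ to a block with two matrix-matrix products. The function names (`householder_vector`, `wy_from_panel`, `apply_qt`) are hypothetical and not from the talk, and the panel is factored QR-style only to keep the example short; it is not a full band reduction.

```python
import numpy as np

def householder_vector(x):
    """Return (v, beta) such that (I - beta v v^T) x is a multiple of e_1."""
    v = np.array(x, dtype=float)
    norm_x = np.linalg.norm(v)
    if norm_x == 0.0:
        return v, 0.0                      # nothing to annihilate
    v[0] += np.copysign(norm_x, v[0])
    return v, 2.0 / (v @ v)

def wy_from_panel(panel):
    """Reduce an n x b panel to upper-triangular R with b Householder
    reflectors and accumulate Q = H_1 H_2 ... H_b as I + W Y^T."""
    R = np.array(panel, dtype=float)
    n, b = R.shape
    W = np.zeros((n, b))
    Y = np.zeros((n, b))
    for j in range(b):
        v = np.zeros(n)
        vj, beta = householder_vector(R[j:, j])
        v[j:] = vj
        R -= beta * np.outer(v, v @ R)     # R <- H_j R (a BLAS-2 style update)
        # Q H_j = I + [W, w] [Y, v]^T  with  w = -beta (v + W Y^T v)
        W[:, j] = -beta * (v + W[:, :j] @ (Y[:, :j].T @ v))
        Y[:, j] = v
    return W, Y, R

def apply_qt(W, Y, C):
    """Compute Q^T C = C + Y (W^T C) using two matrix-matrix products."""
    return C + Y @ (W.T @ C)
```

The point of the design is in `apply_qt`: once W and Y are available, the trailing matrix is updated by matrix-matrix products (BLAS-3) rather than one rank-1 update per reflector.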
Band-to-tridiagonal
• Algorithm
Reduction to banded (sparse matrix) • Cuthill-McKee (CMK) ordering
[Figure: matrix sparsity pattern before and after CMK reordering, band 3 → band 1]
With the CMK method, we can reduce the bandwidth without any floating-point operations.
[Figure: matrix sparsity pattern with a high-degree node before and after reordering, band 6 → band 3]
If some node has a large degree, it is hard to reduce the bandwidth to a small number by permutation alone.
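Both observations can be tried on a small graph. The sketch below is a plain-Python Cuthill-McKee ordering (breadth-first search, visiting neighbours by increasing degree); the helper names and the two example graphs are made up for illustration and are not the matrices from the slides.

```python
from collections import deque

def bandwidth(adj, order):
    """Largest |pos(i) - pos(j)| over all edges (i, j) under the ordering."""
    pos = {node: k for k, node in enumerate(order)}
    return max((abs(pos[i] - pos[j]) for i in adj for j in adj[i]), default=0)

def cuthill_mckee(adj):
    """Breadth-first ordering that starts each component at a minimum-degree
    node and visits unvisited neighbours by increasing degree."""
    degree = {i: len(adj[i]) for i in adj}
    visited, order = set(), []
    for start in sorted(adj, key=lambda i: degree[i]):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in sorted(adj[node], key=lambda i: degree[i]):
                if nb not in visited:
                    visited.add(nb)
                    queue.append(nb)
    return order

def graph_from_edges(n, edges):
    """Adjacency sets of an undirected graph on nodes 0..n-1."""
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    return adj

# A path graph with scrambled labels: the permutation alone recovers band 1.
path_edges = [(0, 3), (3, 5), (5, 1), (1, 6), (6, 2), (2, 4)]
path = graph_from_edges(7, path_edges)

# Same graph plus a hub (node 7) connected to every other node.
hub = graph_from_edges(8, path_edges + [(7, k) for k in range(7)])
```

For the path graph, the natural ordering has bandwidth 5 and the CMK ordering has bandwidth 1. With the hub added, no ordering can do better than 4, since a node with 7 neighbours must sit at least 4 positions away from one of them.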
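• A possible way to deal with nodes that have a large degree:
1. Move these nodes to the bottom.
2. Solve the eigen-update
$\begin{pmatrix} A & b \\ b^T & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix}$
where A is a banded matrix, which is easy to decompose, and b is a dense vector.
• Writing out the two block rows:
$A x + b y = \lambda x$
$b^T x + c y = \lambda y$
Eliminating $y = b^T x / (\lambda - c)$ gives
$A x + \frac{b b^T}{\lambda - c} x = \lambda x$
$\left( A + \frac{b b^T}{\lambda - c} - \lambda I \right) x = 0$
• Suppose that $A = E D E^T$ with $D = \mathrm{diag}(d_i)$, and let $z = E^T b$. Then
$\det\left( D + \frac{z z^T}{\lambda - c} - \lambda I \right) = 0$
• Secular equation:
$1 + \sum_i \frac{z_i^2}{(\lambda - c)(d_i - \lambda)} = 0$
which is similar to the rank-1 update.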
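A quick numerical check of this derivation, as a sketch: build a small bordered matrix, take its eigenvalues from a dense solver, and evaluate the secular function at each of them. The data (a 4 × 4 tridiagonal A, the vector b, and c = 10) are arbitrary illustrative values, and `bordered_secular` is a hypothetical helper name.

```python
import numpy as np

def bordered_secular(lam, d, z, c):
    """f(lam) = 1 + sum_i z_i^2 / ((lam - c) (d_i - lam))."""
    return 1.0 + np.sum(z**2 / ((lam - c) * (d - lam)))

# Illustrative data: tridiagonal (banded) A, dense border b, scalar c.
A = np.diag([1.0, 2.0, 3.0, 4.0]) + 0.5 * (np.eye(4, k=1) + np.eye(4, k=-1))
b = np.array([1.0, 0.5, -0.3, 0.8])
c = 10.0

d, E = np.linalg.eigh(A)          # A = E diag(d) E^T
z = E.T @ b                       # border expressed in the eigenbasis of A

# Full bordered matrix [[A, b], [b^T, c]] and its eigenvalues.
M = np.block([[A, b[:, None]], [b[None, :], np.array([[c]])]])
eigenvalues = np.linalg.eigvalsh(M)

# Each eigenvalue of M should be a root of the secular function.
residuals = [abs(bordered_secular(lam, d, z, c)) for lam in eigenvalues]
```

The check assumes that no eigenvalue of the bordered matrix coincides with c or with some d_i, which is the same assumption the elimination step makes when dividing by λ − c.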
Band-to-tridiagonal (non-uniform)
Tridiagonal eigensolver
• Combination of bisection and inverse iteration (B&I), and MRRR: able to compute a subset of k eigenpairs at reduced cost
– B&I: O(k^2 n), may lose orthogonality
– MRRR: O(kn)
• QR/QL and divide-and-conquer (D&C): designed to compute either all or no eigenvectors
– O(n^3), good eigensystem
Divide-and-conquer algorithm • Step 1, Split the original tridiagonal matrix could be split into two half sized submatrices:
• Step 2, solve subproblems when subproblems are small enough, call QR/QL method, then we have where, w
• Step 3, deflate (1) some components z i of z are (almost) zero, (2) two elements in diagonal matrix are identical (or close), then zero elments can be generated in z
• Step 4, eigendecomposition of the rank-1 update – Solve the secular equation $1 + \sum_{i=1}^{n} \frac{\zeta_i^2}{d_i - \lambda} = 0$, where $\zeta_i$ is the i-th component of z
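Step 4 can be sketched in a few lines. The example below assumes the rank-1 update has the form $D + z z^T$ (no extra scaling factor), that the d_i are strictly increasing and that every component of z is nonzero, which is the situation after deflation. It uses plain bisection, a simplification swapped in here for readability; the function names and the data are hypothetical.

```python
import numpy as np

def secular(lam, d, z):
    """f(lam) = 1 + sum_i z_i^2 / (d_i - lam)."""
    return 1.0 + np.sum(z**2 / (d - lam))

def secular_roots(d, z, iters=200):
    """Eigenvalues of diag(d) + z z^T by bisection.

    Assumes d is strictly increasing and all z_i are nonzero. Then f is
    increasing between poles, with exactly one root in each interval
    (d_i, d_{i+1}) and one in (d_n, d_n + z^T z).
    """
    n = len(d)
    upper = np.append(d[1:], d[-1] + z @ z)
    roots = np.empty(n)
    for i in range(n):
        lo, hi = d[i], upper[i]
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if mid == lo or mid == hi:     # interval cannot shrink further
                break
            if secular(mid, d, z) < 0.0:   # root lies to the right of mid
                lo = mid
            else:
                hi = mid
        roots[i] = 0.5 * (lo + hi)
    return roots

# Illustrative data.
d = np.array([0.0, 1.0, 2.0, 3.5])
z = np.array([0.5, 0.4, 0.3, 0.6])
roots = secular_roots(d, z)
reference = np.linalg.eigvalsh(np.diag(d) + np.outer(z, z))
```

Each root is found independently of the others, which is what makes this step easy to distribute across processors.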
Back transformation to eigenvectors • Eigenvectors of T → eigenvectors of A • $E^{(A)} = H_1 H_2 \cdots H_p E^{(T)}$, cost O(n^2 p) – 1D data layout – 2D data layout
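To make the back transformation concrete, the following serial NumPy sketch runs all three steps on a small dense matrix: one-step Householder tridiagonalization, a tridiagonal eigensolve (delegated to `numpy.linalg.eigh` for brevity), and application of the stored reflectors to the eigenvectors of T. The function names are hypothetical, and no data layout or parallelism is modelled.

```python
import numpy as np

def tridiagonalize(A):
    """Householder reduction T = Q^T A Q with Q = H_1 H_2 ... H_p, p = n - 2.

    Returns T and the list of reflectors (v, beta), H = I - beta v v^T.
    """
    T = np.array(A, dtype=float)
    n = T.shape[0]
    reflectors = []
    for k in range(n - 2):
        v = np.zeros(n)
        v[k + 1:] = T[k + 1:, k]
        norm_x = np.linalg.norm(v)
        if norm_x == 0.0:
            beta = 0.0                         # column already reduced
        else:
            v[k + 1] += np.copysign(norm_x, v[k + 1])
            beta = 2.0 / (v @ v)
        T -= beta * np.outer(v, v @ T)         # T <- H T
        T -= beta * np.outer(T @ v, v)         # T <- T H
        reflectors.append((v, beta))
    T = np.triu(np.tril(T, 1), -1)             # drop rounding noise outside the band
    return T, reflectors

def back_transform(reflectors, E_T):
    """E_A = H_1 H_2 ... H_p E_T, applying H_p first and H_1 last."""
    E = np.array(E_T, dtype=float)
    for v, beta in reversed(reflectors):
        E -= beta * np.outer(v, v @ E)         # O(n^2) work per reflector
    return E

# Illustrative data: a random symmetric 6 x 6 matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = B + B.T

T, reflectors = tridiagonalize(A)
lam, E_T = np.linalg.eigh(T)                   # eigenpairs of T
E_A = back_transform(reflectors, E_T)          # eigenvectors of A
```

Each reflector costs O(n^2) when applied to n eigenvectors, so p reflectors give the O(n^2 p) total quoted above.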
Thank you