ECS231 Low-rank approximation – revisited (Introduction to Randomized Algorithms) May 23, 2019
Outline

1. Review: low-rank approximation
2. Prototype randomized SVD algorithm
3. Accelerated randomized SVD algorithms
4. CUR decomposition
Review: optimal rank-k approximation

◮ The SVD of an m × n matrix A is defined by A = UΣV^T, where U and V are m × m and n × n orthogonal matrices, respectively, Σ = diag(σ_1, σ_2, ...), and σ_1 ≥ σ_2 ≥ ··· ≥ 0.
◮ Computational cost: O(mn^2), assuming m ≥ n.
◮ Rank-k truncated SVD of A:  A_k = U(:,1:k) · Σ(1:k,1:k) · V(:,1:k)^T
Review: optimal rank-k approximation

◮ Eckart–Young theorem.
      min_{rank(B) ≤ k} ‖A − B‖_2 = ‖A − A_k‖_2 = σ_{k+1}
      min_{rank(B) ≤ k} ‖A − B‖_F = ‖A − A_k‖_F = ( Σ_{j=k+1}^n σ_j^2 )^{1/2}
◮ Theorem A.
      min_{rank(B) ≤ k} ‖A − QB‖_F^2 = ‖A − QB_k‖_F^2,
  where Q is an m × p matrix with orthonormal columns, B_k is the rank-k truncated SVD of Q^T A, and 1 ≤ k ≤ p.

Remark: Given an m × n matrix A = (a_ij), the Frobenius norm of A is defined by
      ‖A‖_F = ( Σ_{i=1}^m Σ_{j=1}^n a_ij^2 )^{1/2} = ( trace(A^T A) )^{1/2}.
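As a quick numerical sanity check of the Eckart–Young theorem and Theorem A, here is a small NumPy sketch (Python is used here alongside the deck's MATLAB demos; the matrix sizes and random test data are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, p = 20, 10, 3, 5

A = rng.standard_normal((m, n))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k truncated SVD: A_k = U(:,1:k) * Sigma(1:k,1:k) * V(:,1:k)^T
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Eckart-Young: the 2-norm error equals sigma_{k+1}, and the
# Frobenius error equals the root-sum-of-squares of the tail.
assert np.isclose(np.linalg.norm(A - Ak, 2), s[k])
assert np.isclose(np.linalg.norm(A - Ak, 'fro'), np.sqrt(np.sum(s[k:]**2)))

# Theorem A: with Q m x p orthonormal and B_k the rank-k truncated SVD
# of Q^T A, Q B_k beats Q X for every other rank-k X.
Q, _ = np.linalg.qr(rng.standard_normal((m, p)))
B = Q.T @ A
Ub, sb, Vbt = np.linalg.svd(B, full_matrices=False)
Bk = Ub[:, :k] @ np.diag(sb[:k]) @ Vbt[:k, :]
best = np.linalg.norm(A - Q @ Bk, 'fro')
for _ in range(5):
    X = rng.standard_normal((p, k)) @ rng.standard_normal((k, n))  # arbitrary rank-k competitor
    assert best <= np.linalg.norm(A - Q @ X, 'fro') + 1e-12
```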
Prototype randomized SVD algorithm

By Theorem A, we immediately have the following prototype randomized SVD (low-rank approximation) algorithm:
◮ Input: m × n matrix A with m ≥ n, integers k and ℓ with 0 < k < ℓ < n
◮ Steps:
  1. Draw a random n × ℓ test matrix Ω.
  2. Compute Y = AΩ – “sketching”.
  3. Compute an orthonormal basis Q of Y.
  4. Compute the ℓ × n matrix B = Q^T A.
  5. Compute B_k = the rank-k truncated SVD of B.
  6. Compute Ã_k = QB_k.
◮ Output: Ã_k, a rank-k approximation of A.
Prototype randomized SVD algorithm

MATLAB demo code: randsvd.m

>> ...
>> Omega = randn(n,l);                          % random test matrix
>> C = A*Omega;                                 % sketching
>> Q = orth(C);                                 % orthonormal basis of the sketch
>> [Ua,Sa,Va] = svd(Q'*A);                      % SVD of the small matrix B = Q'*A
>> Ak = (Q*Ua(:,1:k))*Sa(1:k,1:k)*Va(:,1:k)';   % rank-k approximation
>> ...
Prototype randomized SVD algorithm

◮ Theorem. With a proper choice of an n × O(k/ε) sketch Ω,
      min_{rank(X) ≤ k} ‖A − QX‖_F^2 ≤ (1 + ε) ‖A − A_k‖_F^2
  holds with high probability.
◮ Reading: Halko, Martinsson, and Tropp, SIAM Rev., 53:217–288, 2011.
Accelerated randomized SVD algorithm 1

The basic subspace iteration
◮ Input: m × n matrix A with m ≥ n, an n × ℓ starting matrix Ω, and positive integers k, ℓ, q with n > ℓ ≥ k.
◮ Steps:
  1. Compute Y = (AA^T)^q AΩ.
  2. Compute an orthonormal basis Q of Y.
  3. Compute the ℓ × n matrix B = Q^T A.
  4. Compute B_k = the rank-k truncated SVD of B.
  5. Compute Ã_k = QB_k.
◮ Output: Ã_k, a rank-k approximation of A.

Remark: When k = ℓ = 1, this is the classical power method.
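To illustrate the remark, here is a small NumPy sketch of the classical power method, i.e. the case k = ℓ = 1 (the test matrix with a built-in spectral gap is invented for illustration; with iteration count q the error decays like (σ_2/σ_1)^{2q}):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, q = 30, 20, 20

# Build a test matrix with known singular values 3, 1, 0.5, ..., 0.5
# so that the convergence factor (sigma_2/sigma_1)^2 = 1/9 is visible.
U, _ = np.linalg.qr(rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s_true = np.array([3.0, 1.0] + [0.5] * (n - 2))
A = U[:, :n] @ np.diag(s_true) @ V.T

# Power method: y proportional to (A A^T)^q A omega, normalized each step.
omega = rng.standard_normal(n)
y = A @ omega
y /= np.linalg.norm(y)
for _ in range(q):
    z = A.T @ y
    z /= np.linalg.norm(z)
    y = A @ z
    y /= np.linalg.norm(y)

# Estimate of the top singular value from the converged left vector.
est = np.linalg.norm(A.T @ y)
assert abs(est - 3.0) < 1e-8
```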
Accelerated randomized SVD algorithm 2

Remarks on the basic subspace iteration:
◮ The orthonormal basis Q of Y = (AA^T)^q AΩ should be stably computed by the following loop:
      compute Y = AΩ
      factor Y = QR (QR decomposition)
      for j = 1, 2, ..., q
          compute Y = A^T Q
          factor Y = QR (QR decomposition)
          compute Y = AQ
          factor Y = QR (QR decomposition)
◮ Convergence results: Under mild assumptions on the starting matrix Ω,
  (a) the basic subspace iteration converges as q → ∞;
  (b) |σ_j − σ_j(QB_k)| ≤ O( (σ_{ℓ+1}/σ_k)^{2q+1} ).
◮ Reading: M. Gu, Subspace iteration randomization and singular value problems, arXiv:1408.2208, 2014.
Accelerated randomized SVD algorithm 3

◮ Input: m × n matrix A with m ≥ n, positive integers k, ℓ, q with n > ℓ > k.
◮ Steps:
  1. Draw a random n × ℓ test matrix Ω.
  2. Compute Y = (AA^T)^q AΩ – “sketching”.
  3. Compute an orthonormal basis Q of Y.
  4. Compute the ℓ × n matrix B = Q^T A.
  5. Compute B_k = the rank-k truncated SVD of B.
  6. Compute Ã_k = QB_k.
◮ Output: Ã_k, a rank-k approximation of A.
Accelerated randomized SVD algorithm 4

MATLAB demo code: randsvd2.m

>> ...
>> Omega = randn(n,l);
>> C = A*Omega;
>> Q = orth(C);
>> for i = 1:q                   % subspace iteration with reorthonormalization
>>     C = A'*Q;
>>     Q = orth(C);
>>     C = A*Q;
>>     Q = orth(C);
>> end
>> [Ua2,Sa2,Va2] = svd(Q'*A);
>> Ak2 = (Q*Ua2(:,1:k))*Sa2(1:k,1:k)*Va2(:,1:k)';
>> ...
The CUR decomposition

The CUR decomposition: find an optimal intersection matrix U such that
      A ≈ CUR,
where C consists of c selected columns of A, and R consists of r selected rows of A.
The CUR decomposition

Theorem.
(a) ‖A − CC^+A‖ ≤ ‖A − CX‖ for any X
(b) ‖A − CC^+AR^+R‖ ≤ ‖A − CXR‖ for any X
(c) U* = argmin_U ‖A − CUR‖_F^2 = C^+AR^+
where ‖·‖ is a unitarily invariant norm.

Remark: Let A = UΣV^T be the SVD of an m × n matrix A with m ≥ n. Then the pseudo-inverse (also called generalized inverse) A^+ of A is given by A^+ = VΣ^+U^T, where Σ^+ = diag(σ_1^+, ...) and σ_j^+ = 1/σ_j if σ_j ≠ 0, otherwise σ_j^+ = 0. If A is of full column rank, then A^+ = (A^T A)^{-1} A^T. In MATLAB, pinv(A) is a built-in function to compute the pseudo-inverse of A.
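Part (c) and the full-column-rank formula for the pseudo-inverse can be checked numerically; here is a NumPy sketch (the choice of the leading columns and rows for C and R is arbitrary, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, c, r = 15, 12, 5, 4

A = rng.standard_normal((m, n))
C = A[:, :c]   # some selected columns of A
R = A[:r, :]   # some selected rows of A

# Part (c): U* = C^+ A R^+ minimizes ||A - C U R||_F over all U.
U_star = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
err_star = np.linalg.norm(A - C @ U_star @ R, 'fro')

# Any other intersection matrix U does no better.
for _ in range(10):
    U_rand = rng.standard_normal((c, r))
    assert err_star <= np.linalg.norm(A - C @ U_rand @ R, 'fro') + 1e-12

# Full column rank: pinv agrees with (A^T A)^{-1} A^T.
assert np.allclose(np.linalg.pinv(C), np.linalg.inv(C.T @ C) @ C.T)
```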
The CUR decomposition

MATLAB demo code: randcur.m

>> ...
>> bound = n*log(n)/m;                        % sampling probability
>> sampled_rows = find(rand(m,1) < bound);    % sample rows at random
>> R = A(sampled_rows,:);
>> sampled_cols = find(rand(n,1) < bound);    % sample columns at random
>> C = A(:,sampled_cols);
>> U = pinv(C)*A*pinv(R);                     % optimal intersection U = C^+ A R^+
>> ...
The CUR decomposition

◮ Theorem. With c = O(k/ε) columns and r = O(k/ε) rows selected by adaptive sampling to form C and R,
      min_X ‖A − CXR‖_F^2 ≤ (1 + ε) ‖A − A_k‖_F^2
  holds in expectation.
◮ Reading: Boutsidis and Woodruff, STOC, pp. 353–362, 2014.