(fast) Randomized SVD


  1. (fast) Randomized SVD. Ryan Levy, Algorithm Interest Group, Jan. 31, 2019. Image: Wikipedia

  2. Roadmap • Review SVD • It’s awesome - why you should love it • Singular values are almost math magic • Bottleneck Scenarios – the need for stochastic methods • Randomized SVD algorithms • Easy • Improvements • Pictures

  3. SVD Review. That trick you learned in math class! The eigendecomposition of a matrix is powerful, but the matrix must be square ⇒ generalize to the SVD: M = U Σ V†, which exists for any matrix. Σ is diagonal and holds the singular values; U and V are unitary. If M is square (and normal), its eigenvectors can serve as U and V. The SVD has a geometric interpretation for some M (rotate, scale, rotate). We can approximate M by truncating small singular values; see the sketch below.
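     A minimal NumPy sketch of that last point, truncating the singular values of a stand-in random matrix (the shapes and k here are purely illustrative):

         import numpy as np

         rng = np.random.default_rng(0)
         M = rng.standard_normal((100, 50))                 # stand-in matrix

         U, s, Vh = np.linalg.svd(M, full_matrices=False)   # M = U @ diag(s) @ Vh
         k = 10                                             # keep the k largest singular values
         M_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]        # best rank-k approximation

         # By Eckart-Young, the spectral-norm error equals the first dropped singular value:
         print(np.linalg.norm(M - M_k, 2), s[k])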

  4. Example 1:
     M = [[1, 2], [3, 4], [5, 6]] = U Σ V†, with
     U ≈ [[0.23, 0.88, 0.41], [0.52, 0.24, −0.82], [0.82, −0.40, 0.41]],
     Σ ≈ [[9.53, 0], [0, 0.51], [0, 0]],
     V† ≈ [[0.62, 0.78], [−0.78, 0.62]].
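     The example can be checked directly with NumPy (signs of paired singular vectors may differ from the slide):

         import numpy as np

         M = np.array([[1., 2.], [3., 4.], [5., 6.]])
         U, s, Vh = np.linalg.svd(M)      # full SVD: U is 3 x 3, Vh is 2 x 2
         print(np.round(s, 2))            # [9.53 0.51]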

  5. Example 2

  6. Example 2 (continued). Key: [% of Σ set to 0] – [# of singular values remaining]: 50% – 298, 75% – 149, 90% – 60, 95% – 30.

  7. Where do we see SVDs in physics? • Principal Component Analysis (PCA) • Look at the dominant principal components – large singular values – to analyze a multi-dimensional problem • Easier linear algebra (matrix exponentials, approximating data, etc.) • Clustering problems (similar to PCA) • Calculating entanglement entropy • Schmidt decomposition • Pseudo-inverse. Images: Wikipedia, doi:10.1038/nature15750

  8. SVD Bottlenecks. The full SVD algorithm costs ∼O(mn²) and requires ∼O(m) passes through the matrix, so large matrices carry a huge computational cost. Sometimes there are hundreds of large matrices to SVD (e.g. Facebook): “…the adjacency matrix of Facebook users to Facebook pages induced by likes, with size O(10⁹) × O(10⁸)”. Source: Facebook research.

  9. SVD Algorithm ~complicated~

  10. Method 1 – Power Method. Pros: physicists know how to do this (Lanczos!); parallelizes very well. Cons: hard to find many principal components; large degeneracies slow down convergence; larger storage/GEMM cost. The idea (a sketch follows below):
      1. Notice that an SVD is the same as an eigenvalue problem, M = U Σ V† ⇔ Av = Ev:
         [[0, M], [M†, 0]] [U; V] = Σᵢᵢ [U; V]
      2. Notice that solving an eigenvalue problem is the same as repeated application: Mᴺ x → E x for N ≫ 1.
      3. Start with a random vector, then apply the matrix (the Hamiltonian), normalizing after each step.
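      A minimal power-iteration sketch in NumPy for the leading singular triplet, iterating with M†M rather than the block matrix (the function name is mine, not from the talk):

          import numpy as np

          def top_singular_triplet(M, n_iter=100, seed=0):
              """Power iteration on M†M for the leading singular value and vectors."""
              rng = np.random.default_rng(seed)
              v = rng.standard_normal(M.shape[1])
              v /= np.linalg.norm(v)
              for _ in range(n_iter):
                  v = M.conj().T @ (M @ v)   # one application of M†M
                  v /= np.linalg.norm(v)     # normalize after each step
              sigma = np.linalg.norm(M @ v)  # leading singular value
              u = (M @ v) / sigma            # corresponding left singular vector
              return u, sigma, v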

  11. “Easy” Randomized SVD. Goal: obtain an SVD for the top k singular values of an m × n matrix M, assuming m > n.
      1. Create an n × k matrix Ω of random [normal] samples. The random vectors are hopefully a superposition of the correct basis.
      2. Do a QR decomposition of the sampled matrix MΩ (the “randomized range finder”).
         a. Reminder: QR = (orthogonal matrix)(upper triangular).
         b. QR is slow but accurate.
         c. The orthogonal matrix Q is m × k.
      3. Create the “smaller” k × n matrix B = Q†M.
      4. Do an SVD on B = u Σ V†.
      5. Recover the original U = Qu.
      Source: Halko, Martinsson and Tropp (2009). A sketch follows below.
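      A direct NumPy transcription of steps 1–5 (a sketch of the Halko–Martinsson–Tropp scheme, not the talk's exact code):

          import numpy as np

          def rsvd(M, k, seed=0):
              """'Easy' randomized SVD: approximate top-k SVD of an m x n matrix M."""
              rng = np.random.default_rng(seed)
              Omega = rng.standard_normal((M.shape[1], k))      # 1. n x k Gaussian samples
              Q, _ = np.linalg.qr(M @ Omega)                    # 2. range finder; Q is m x k
              B = Q.conj().T @ M                                # 3. smaller k x n matrix
              u, s, Vh = np.linalg.svd(B, full_matrices=False)  # 4. SVD of B
              return Q @ u, s, Vh                               # 5. U = Q u

      Then U @ np.diag(s) @ Vh gives an approximate rank-k reconstruction of M.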

  12. Visual example: actual image vs. rSVD reconstruction, k = 100. Thanks to smortezavi’s example code.

  13. Visual example: actual image vs. rSVD reconstruction, k = 10. Thanks to smortezavi’s example code.

  14. Comments on Randomized SVD • By using certain random sample matrices we can speed up the algorithm and form fewer intermediate matrices • How well can we do? • Bounded by the error of using a rank-k matrix • Can sample several times to get another error estimate • Con – accuracy suffers when the singular values decay slowly • What if we use the Lanczos idea and project into the M subspace? • Best: combine both techniques!

  15. Improve Range Subspace – Power Method. Replace the sample MΩ with (MM†)^q MΩ, which sharpens the decay of the singular values. This has a rounding-error problem: instead, do a QR decomposition at every step, alternating applications of M and M†. A sketch follows below.
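      A sketch of this power-iteration refinement with per-step QR stabilization, extending the rsvd sketch above (hypothetical code, not the talk's):

          import numpy as np

          def rsvd_power(M, k, q=1, seed=0):
              """Randomized SVD with q power iterations, re-orthonormalizing each step."""
              rng = np.random.default_rng(seed)
              Omega = rng.standard_normal((M.shape[1], k))
              Q, _ = np.linalg.qr(M @ Omega)              # initial range estimate
              for _ in range(q):                          # (M M†)^q M Omega, stabilized
                  W, _ = np.linalg.qr(M.conj().T @ Q)     # apply M†, re-orthonormalize
                  Q, _ = np.linalg.qr(M @ W)              # apply M,  re-orthonormalize
              B = Q.conj().T @ M
              u, s, Vh = np.linalg.svd(B, full_matrices=False)
              return Q @ u, s, Vh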

  16. Visual example: actual image vs. rSVD reconstruction, k = 100, q = 1.

  17. Visual example: actual image (k = 10) vs. rSVD with k = 10 and q = 0, 1, 5. Timing (s): full SVD 0.613; rSVD q = 0: 0.0096; rSVD q = 1: 0.022.

  18. Visual example: actual image (k = 10) vs. rSVD with k = 10, q = 0 and q = 20; the q = 20 run is unstable.

  19. Conclusions • The SVD is a powerful technique but slow for large matrices • Because we don’t always need all the singular values, we can guess how many we need and build a faster algorithm • Randomized SVD estimates a smaller subspace on which to perform a full SVD • It can be sped up by using smart random sampling • It can be improved by using a power method or oversampling. Thanks!
