Randomized SVD, CUR Decomposition, and SPSD Matrix Approximation - PowerPoint PPT Presentation
Randomized SVD, CUR Decomposition, and SPSD Matrix Approximation
Shusen Wang
Outline
• CX Decomposition & Approximate SVD
• CUR Decomposition
• SPSD Matrix Approximation
CX Decomposition
• Given any matrix A ∈ ℝ^{m×n}
• The CX decomposition of A:
  1. Sketching: C = AS ∈ ℝ^{m×c}
  2. Find X such that A ≈ CX
• E.g. X★ = argmin_X ‖A − CX‖_F² = C†A
• It costs O(mnc) time
• CX decomposition ⇔ approximate SVD
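The two steps above fit in a few lines of NumPy. This is a minimal sketch, not the deck's own code: uniform column sampling stands in for the sketching matrix S, and the sizes (m, n, c) and the rank-10 test matrix are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, c = 100, 80, 20
A = rng.standard_normal((m, 10)) @ rng.standard_normal((10, n))  # rank-10 test matrix

# Step 1 (sketching): C = A S, where S here selects c columns uniformly at random
cols = rng.choice(n, size=c, replace=False)
C = A[:, cols]                            # shape (m, c)

# Step 2: the optimal X for min_X ||A - C X||_F^2 is X* = C^+ A
X = np.linalg.pinv(C) @ A                 # shape (c, n)

err = np.linalg.norm(A - C @ X, 'fro')    # tiny here, since rank(A) <= c
```

Because the test matrix has rank 10 ≤ c, the sampled columns span its column space almost surely, so the residual is essentially zero; for general A the residual is controlled by the sketch-size guarantees quoted in the deck.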
CX Decomposition
• Let the sketching matrix S ∈ ℝ^{n×c} be defined as in the table. Then
  min_X ‖A − CX‖_F² ≤ (1 + ε) ‖A − A_k‖_F²
  provided c is at least:
  - Uniform sampling: O(μk log k + μk/ε)
  - Leverage score sampling: O(k log k + k/ε)
  - Gaussian projection: O(k + k/ε)
  - SRHT: O((k + log n)(log k + 1/ε))
  - Count sketch: O(k²/ε + k)
• μ is the column coherence of A
CX Decomposition ⇔ Approximate SVD
• CX decomposition ⇔ approximate SVD: given A ≈ CX, compute
  - SVD: C = U_C Σ_C V_C^T ∈ ℝ^{m×c} (time O(mc²))
  - Let Z = Σ_C V_C^T X ∈ ℝ^{c×n} (time O(nc²))
  - SVD: Z = U_Z Σ_Z V_Z^T ∈ ℝ^{c×n} (time O(nc²))
  - Then A ≈ CX = U_C Σ_C V_C^T X = U_C Z = (U_C U_Z) Σ_Z V_Z^T (time O(mc²))
• Done! Approximate rank-c SVD: A ≈ (U_C U_Z) Σ_Z V_Z^T
  - U_C U_Z: m×c matrix with orthonormal columns
  - Σ_Z: c×c diagonal matrix
  - V_Z^T: c×n matrix with orthonormal rows
• Time cost: O(mc² + nc² + nc² + mc²) = O(mc² + nc²)
CX Decomposition ⇔ Approximate SVD
• Given A ∈ ℝ^{m×n} and C ∈ ℝ^{m×c}, the approximate SVD costs
  - O(mnc) time
  - O(mc + nc) memory
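The four-step conversion from a CX decomposition to an approximate SVD can be sketched as follows; the sizes and the uniform column sampling are illustrative assumptions, not fixed by the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, c = 100, 80, 15
A = rng.standard_normal((m, 8)) @ rng.standard_normal((8, n))  # rank-8 test matrix

# CX decomposition: C = A S (uniform column sampling), X = C^+ A
C = A[:, rng.choice(n, size=c, replace=False)]
X = np.linalg.pinv(C) @ A

# Step 1: thin SVD of C (m x c), O(m c^2) time
U_C, S_C, VT_C = np.linalg.svd(C, full_matrices=False)

# Step 2: Z = Sigma_C V_C^T X, a small c x n matrix, O(n c^2) time
Z = (S_C[:, None] * VT_C) @ X

# Step 3: SVD of the small matrix Z, O(n c^2) time
U_Z, S_Z, VT_Z = np.linalg.svd(Z, full_matrices=False)

# Step 4: left singular vectors U = U_C U_Z, O(m c^2) time
U = U_C @ U_Z

# The factorization (U, S_Z, VT_Z) reproduces C X exactly
approx = U @ (S_Z[:, None] * VT_Z)
err = np.linalg.norm(approx - C @ X, 'fro')
```

Note that no SVD of the full m×n matrix is ever taken: only the two skinny factors C and Z are decomposed.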
CX Decomposition
• The CX decomposition of A ∈ ℝ^{m×n}
• Optimal solution: X★ = argmin_X ‖A − CX‖_F² = C†A
• How to make it more efficient?
• It is a regression problem!
Fast CX Decomposition
• Fast CX [Drineas, Mahoney, Muthukrishnan, 2008] [Clarkson & Woodruff, 2013]
• Draw another sketching matrix T ∈ ℝ^{m×s}
• Compute X̃ = argmin_X ‖T^T A − T^T C X‖_F² = (T^T C)† T^T A
• Time cost: O(ncs) + TimeOfSketch
• When s = Õ(c/ε),
  ‖A − CX̃‖_F² ≤ (1 + ε) · min_X ‖A − CX‖_F²
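The sketched regression above can be sketched in NumPy; a Gaussian projection is used for T as one valid choice, and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, c, s = 2000, 300, 30, 120
A = rng.standard_normal((m, 20)) @ rng.standard_normal((20, n))  # rank-20 test matrix
C = A[:, rng.choice(n, size=c, replace=False)]                   # C = A S_C

# Second sketch T (m x s); a Gaussian projection is one valid choice
T = rng.standard_normal((m, s)) / np.sqrt(s)

# Sketched regression: min_X ||T^T A - T^T C X||_F^2, solved by a small pseudo-inverse
X_fast = np.linalg.pinv(T.T @ C) @ (T.T @ A)

err_fast = np.linalg.norm(A - C @ X_fast, 'fro')
err_opt = np.linalg.norm(A - C @ (np.linalg.pinv(C) @ A), 'fro')  # exact solution, for reference
```

The pseudo-inverse is taken of the small s×c matrix T^T C, so the solve never touches all m rows of the regression problem.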
Outline
• CX Decomposition & Approximate SVD
• CUR Decomposition
• SPSD Matrix Approximation
CUR Decomposition
• Sketching:
  - C = A S_C ∈ ℝ^{m×c}
  - R = S_R^T A ∈ ℝ^{r×n}
• Find U such that C U R ≈ A
• CUR ⇔ approximate SVD, in the same way as “CX ⇔ approximate SVD”
• 3 types of U
CUR Decomposition
• Type 1 [Drineas, Mahoney, Muthukrishnan, 2008]:
  U = (S_R^T A S_C)† = (S_R^T C)†
• Recall the fast CX decomposition with T = S_R:
  A ≈ C X̃ = C (S_R^T C)† S_R^T A = C U R
• They’re equivalent: C X̃ = C U R
• Require c = O(k/ε) and r = O(c/ε) such that
  ‖A − C U R‖_F² ≤ (1 + ε) ‖A − A_k‖_F²
CUR Decomposition
• Type 1 [Drineas, Mahoney, Muthukrishnan, 2008]: U = (S_R^T A S_C)†
• Efficient
  - O(rc²) + TimeOfSketch
• Loose bound
  - Sketch size ∝ ε⁻²
• Bad empirical performance
CUR Decomposition
• Type 2: Optimal CUR
  U★ = argmin_U ‖A − C U R‖_F² = C† A R†
• Theory [W & Zhang, 2013], [Boutsidis & Woodruff, 2014]:
  - C and R are selected by the adaptive sampling algorithm
  - c = O(k/ε) and r = O(k/ε)
  - ‖A − C U★ R‖_F² ≤ (1 + ε) ‖A − A_k‖_F²
CUR Decomposition
• Type 2: Optimal CUR
  U★ = argmin_U ‖A − C U R‖_F² = C† A R†
• Inefficient
  - O(mnc) + TimeOfSketch
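The optimal U is a one-liner, and the code makes the cost problem visible: forming C†A touches the full m×n matrix. A hedged sketch with uniform sampling and illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, c, r = 120, 90, 25, 25
A = rng.standard_normal((m, 15)) @ rng.standard_normal((15, n))  # rank-15 test matrix

C = A[:, rng.choice(n, size=c, replace=False)]   # column sketch C = A S_C
R = A[rng.choice(m, size=r, replace=False), :]   # row sketch R = S_R^T A

# Optimal intersection matrix: U* = C^+ A R^+; the product C^+ A already costs O(mnc)
U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)

err = np.linalg.norm(A - C @ U @ R, 'fro')
```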
CUR Decomposition
• Type 3: Fast CUR [W, Zhang, Zhang, 2015]
• Draw 2 sketching matrices P_C and P_R
• Solve the problem
  Ũ = argmin_U ‖P_C^T (A − C U R) P_R‖_F² = (P_C^T C)† (P_C^T A P_R) (R P_R)†
• Intuition?
CUR Decomposition
• The optimal U matrix is obtained by the optimization problem
  U★ = argmin_U ‖C U R − A‖_F²
CUR Decomposition • Approximately solve the optimization problem, e.g. by column selection
CUR Decomposition • Solve the small-scale problem
CUR Decomposition
• Type 3: Fast CUR [W, Zhang, Zhang, 2015]
• Draw 2 sketching matrices P_C ∈ ℝ^{m×s_c} and P_R ∈ ℝ^{n×s_r}
• Solve the problem
  Ũ = argmin_U ‖P_C^T (A − C U R) P_R‖_F² = (P_C^T C)† (P_C^T A P_R) (R P_R)†
• Theory
  - s_c = O(c/ε) and s_r = O(r/ε)
  - ‖A − C Ũ R‖_F² ≤ (1 + ε) · min_U ‖A − C U R‖_F²
CUR Decomposition
• Type 3: Fast CUR [W, Zhang, Zhang, 2015]
• Draw 2 sketching matrices P_C ∈ ℝ^{m×s_c} and P_R ∈ ℝ^{n×s_r}
• Solve the problem
  Ũ = argmin_U ‖P_C^T (A − C U R) P_R‖_F² = (P_C^T C)† (P_C^T A P_R) (R P_R)†
• Efficient
  - O(s_c s_r (c + r)) + TimeOfSketch
• Good empirical performance
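The fast-CUR solve can be sketched as below; Gaussian matrices are used for P_C and P_R as one valid choice, and all sizes are illustrative assumptions. Only small matrices are ever pseudo-inverted.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, c, r = 150, 120, 20, 20
sc, sr = 60, 60                                   # sketch sizes s_c > c, s_r > r
A = rng.standard_normal((m, 12)) @ rng.standard_normal((12, n))  # rank-12 test matrix

C = A[:, rng.choice(n, size=c, replace=False)]    # C = A S_C
R = A[rng.choice(m, size=r, replace=False), :]    # R = S_R^T A

# Two extra sketches P_C (m x s_c) and P_R (n x s_r); Gaussian is one valid choice
P_C = rng.standard_normal((m, sc)) / np.sqrt(sc)
P_R = rng.standard_normal((n, sr)) / np.sqrt(sr)

# U~ = (P_C^T C)^+ (P_C^T A P_R) (R P_R)^+ : every pseudo-inverse is of a small matrix
U_fast = np.linalg.pinv(P_C.T @ C) @ (P_C.T @ A @ P_R) @ np.linalg.pinv(R @ P_R)

err = np.linalg.norm(A - C @ U_fast @ R, 'fro')
```

In this exact low-rank setting the sketched solution coincides with the optimal U, so the residual is essentially zero; in general it is within a (1 + ε) factor of optimal.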
[Figure: image reconstruction comparison]
• A: m = 1920, n = 1168; C and R: c = r = 100, uniform sampling
• Panels: Original; Type 2: Optimal CUR; Type 1: Fast CX; Type 3: Fast CUR (s_c = 2c, s_r = 2r); Type 3: Fast CUR (s_c = 4c, s_r = 4r)
Conclusions
• Approximate truncated SVD
• CX decomposition
• CUR decomposition (3 types)
• Fast CUR is the best
Outline
• CX Decomposition & Approximate SVD
• CUR Decomposition
• SPSD Matrix Approximation
Motivation 1: Kernel Matrix
• Given n samples x₁, …, x_n ∈ ℝ^d and a kernel function κ(·,·)
• E.g. the Gaussian RBF kernel
  κ(x_i, x_j) = exp(−‖x_i − x_j‖² / σ²)
• Computing the kernel matrix K ∈ ℝ^{n×n}, where k_ij = κ(x_i, x_j), costs O(n²d) time
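The O(n²d) cost shows up directly when forming the kernel matrix; a minimal NumPy sketch (sample count, dimension, and bandwidth σ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, sigma = 200, 5, 1.0                 # sample count, dimension, bandwidth
X = rng.standard_normal((n, d))           # rows are the samples x_1, ..., x_n

# All pairwise squared distances via ||x_i||^2 + ||x_j||^2 - 2 <x_i, x_j>;
# the X @ X.T product dominates, giving O(n^2 d) total time
sq = (X ** 2).sum(axis=1)
D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

# Gaussian RBF kernel matrix K, with k_ij = exp(-||x_i - x_j||^2 / sigma^2)
K = np.exp(-D2 / sigma ** 2)
```

The clamp to zero guards against tiny negative squared distances from floating-point round-off; K is symmetric with unit diagonal, and for the RBF kernel it is SPSD, which is what the approximation methods in this section exploit.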