Spectral Methods Meet Asymmetry: Two Recent Stories
Yuxin Chen, Electrical Engineering, Princeton University
Spectral methods based on eigen-decomposition

M = E[M] + (M − E[M]),   where E[M] is approximately low-rank

Methods based on eigen-decomposition of a certain data matrix M ...

This talk: what happens if the data matrix M is non-symmetric? (2 recent stories)
Asymmetry helps: eigenvalue and eigenvector analyses of asymmetrically perturbed low-rank matrices

Jianqing Fan (Princeton ORFE), Chen Cheng (Stanford Stats)
Eigenvalue / eigenvector estimation

M = M⋆ + H,   M⋆: truth,   H: noise

• A rank-1 matrix: M⋆ = λ⋆ u⋆ u⋆⊤ ∈ ℝ^{n×n}
• Observed noisy data: M = M⋆ + H
• Goal: estimate the eigenvalue λ⋆ and the eigenvector u⋆
Non-symmetric noise matrix

M = M⋆ + H,   M⋆ = λ⋆ u⋆ u⋆⊤,   H: asymmetric matrix

This may arise when, e.g., we have 2 samples for each entry of M⋆ and arrange them in an asymmetric manner (see the sketch below).
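To make this concrete, here is a minimal sketch of one such arrangement (my illustration, not code from the talk): given two independent noisy samples of the same symmetric M⋆, place one sample above the diagonal and the other below. The noise entries remain independent, but the resulting data matrix is asymmetric.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

u_star = rng.standard_normal(n)
u_star /= np.linalg.norm(u_star)
M_star = np.outer(u_star, u_star)                  # rank-1 truth, lambda_star = 1

sigma = 1.0 / np.sqrt(n * np.log(n))
A = M_star + sigma * rng.standard_normal((n, n))   # noisy sample 1
B = M_star + sigma * rng.standard_normal((n, n))   # noisy sample 2

# sample 1 on and above the diagonal, sample 2 strictly below:
# the noise entries stay independent, but M is no longer symmetric
M = np.triu(A) + np.tril(B, k=-1)
assert not np.allclose(M, M.T)
```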
A natural estimation strategy: SVD

M = M⋆ + H,   M⋆ = λ⋆ u⋆ u⋆⊤,   H: asymmetric matrix

• Use the leading singular value λ_svd of M to estimate λ⋆
• Use the leading left singular vector of M to estimate u⋆
A less popular strategy: eigen-decomposition

M = M⋆ + H,   M⋆ = λ⋆ u⋆ u⋆⊤,   H: asymmetric matrix

• Use the leading eigenvalue λ_eigs of M to estimate λ⋆
• Use the leading eigenvector of M to estimate u⋆
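For reference, a minimal NumPy sketch of the two estimators (the helper names are mine). One implementation detail worth noting: for an asymmetric real matrix, np.linalg.eig can return complex output, so the sketch picks the eigenvalue of largest magnitude and keeps its real part, whose imaginary part is negligible in the experiments that follow.

```python
import numpy as np

def svd_estimate(M):
    # leading singular value and leading left singular vector of M
    U, s, Vt = np.linalg.svd(M)
    return s[0], U[:, 0]

def eig_estimate(M):
    # eigenvalue of M with the largest magnitude; np.linalg.eig may return
    # complex values for asymmetric real M, so keep the real part
    w, V = np.linalg.eig(M)
    k = np.argmax(np.abs(w))
    return w[k].real, V[:, k].real
```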
SVD vs. eigen-decomposition

For asymmetric matrices:
• Numerical stability: SVD > eigen-decomposition
• (Folklore?) Statistical accuracy: SVD ≍ eigen-decomposition

Shall we always prefer SVD over eigen-decomposition?
A curious numerical experiment: Gaussian noise

M = u⋆ u⋆⊤ + H,   {H_ij} i.i.d. N(0, σ²),   σ = 1/√(n log n),   M⋆ = u⋆ u⋆⊤

[Figure: |λ_svd − λ⋆| and |λ_eigs − λ⋆| vs. n (200 to 2000), log scale; the eigen-decomposition error matches the rescaled SVD error |λ_svd − λ⋆| / (2.5√n).]

empirically,   |λ_svd − λ⋆| / |λ_eigs − λ⋆| ≈ 2.5 √n
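A short sketch reproducing this experiment under the stated model (my code, not the talk's; λ⋆ = 1 here). If the √n gap holds, the last printed column should hover around a constant, roughly the 2.5 reported on the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
for n in (200, 500, 1000, 2000):
    u = rng.standard_normal(n)
    u /= np.linalg.norm(u)
    sigma = 1.0 / np.sqrt(n * np.log(n))
    M = np.outer(u, u) + sigma * rng.standard_normal((n, n))  # lambda_star = 1

    lam_svd = np.linalg.svd(M, compute_uv=False)[0]
    w = np.linalg.eigvals(M)
    lam_eig = w[np.argmax(np.abs(w))].real

    err_svd, err_eig = abs(lam_svd - 1.0), abs(lam_eig - 1.0)
    print(n, err_svd, err_eig, err_svd / err_eig / np.sqrt(n))
```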
Another numerical experiment: matrix completion

M⋆ = u⋆ u⋆⊤;   M_ij = (1/p) M⋆_ij with prob. p, and 0 else;   p = (3 log n)/n

[Illustration: a matrix with most entries missing. Figure: same comparison as before; the eigen-decomposition error again matches the rescaled SVD error |λ_svd − λ⋆| / (2.5√n).]

empirically,   |λ_svd − λ⋆| / |λ_eigs − λ⋆| ≈ 2.5 √n
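A matching sketch of this sampling model (again my illustration; the inverse-probability rescaling follows the displayed definition of M). Note that the observation pattern is drawn independently for entries (i, j) and (j, i), which is exactly why the perturbation H = M − M⋆ is asymmetric here even though M⋆ is symmetric.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
M_star = np.outer(u, u)

p = 3 * np.log(n) / n
mask = rng.random((n, n)) < p           # each entry observed independently
M = np.where(mask, M_star / p, 0.0)     # inverse-probability weighting: E[M] = M_star
```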
Why does eigen-decomposition work so much better than SVD?
Problem setup

M = u⋆ u⋆⊤ + H ∈ ℝ^{n×n},   M⋆ = u⋆ u⋆⊤

• H: noise matrix
  ◦ independent entries: {H_ij} are independent
  ◦ zero mean: E[H_ij] = 0
  ◦ variance: Var(H_ij) ≤ σ²
  ◦ magnitudes: P{|H_ij| ≥ B} ≲ n⁻¹²
• M⋆ obeys the incoherence condition

  max_{1≤i≤n} |e_i⊤ u⋆| ≤ √(µ/n)
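In code, the incoherence parameter of a unit vector can be read off directly (a small helper of my own, not from the talk):

```python
import numpy as np

def incoherence(u):
    # mu such that max_i |u_i| = sqrt(mu / n), for a unit vector u;
    # mu close to 1 means the energy of u is spread out across coordinates
    n = u.shape[0]
    return n * np.max(np.abs(u)) ** 2
```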
Classical linear algebra results

|λ_svd − λ⋆| ≤ ‖H‖   (Weyl)
|λ_eigs − λ⋆| ≤ ‖H‖   (Bauer-Fike)

⇓ matrix Bernstein inequality

|λ_svd − λ⋆| ≲ σ√(n log n) + B log n   (reasonably tight if ‖H‖ is large)
|λ_eigs − λ⋆| ≲ σ√(n log n) + B log n   (can be significantly improved)
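For completeness, here is the step being invoked, sketched under the setup's assumptions rather than quoted from the talk: matrix Bernstein controls ‖H‖, and Weyl (resp. Bauer-Fike) transfers that control to the eigenvalue estimates.

```latex
% With independent zero-mean entries, Var(H_{ij}) <= sigma^2, |H_{ij}| <= B,
% the matrix Bernstein inequality gives, with high probability,
\[
  \|H\| \;\lesssim\; \sigma\sqrt{n\log n} + B\log n,
\]
% and hence, by Weyl (resp. Bauer--Fike),
\[
  \bigl|\lambda_{\mathrm{svd}} - \lambda^\star\bigr| \;\le\; \|H\|
  \;\lesssim\; \sigma\sqrt{n\log n} + B\log n .
\]
```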
Main results: eigenvalue perturbation

Theorem 1 (Chen, Cheng, Fan ’18). With high prob., the leading eigenvalue λ_eigs of M obeys

  |λ_eigs − λ⋆| ≲ √(µ/n) ( σ√(n log n) + B log n )

[Figure: same plots as before; the theoretical √(n/µ) gain matches the empirical ≈ 2.5√n gap.]

• Eigen-decomposition is √(n/µ) times better than SVD!
  (recall |λ_svd − λ⋆| ≲ σ√(n log n) + B log n)
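The improvement factor follows from dividing the two displayed bounds (my arithmetic; for constant µ this is consistent with the ≈ 2.5√n gap seen empirically):

```latex
\[
  \frac{\sigma\sqrt{n\log n} + B\log n}
       {\sqrt{\mu/n}\,\bigl(\sigma\sqrt{n\log n} + B\log n\bigr)}
  \;=\; \sqrt{n/\mu}.
\]
```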
Main results: entrywise eigenvector perturbation

Theorem 2 (Chen, Cheng, Fan ’18). With high prob., the leading eigenvector u of M obeys

  min{ ‖u − u⋆‖∞, ‖u + u⋆‖∞ } ≲ √(µ/n) ( σ√(n log n) + B log n )

• if ‖H‖ ≪ |λ⋆|, then

  min{ ‖u − u⋆‖₂, ‖u + u⋆‖₂ } / ‖u⋆‖₂ ≪ (classical bound)
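One practical detail implicit in the min over ± signs: an eigenvector is defined only up to a global sign, so the error must be measured after sign alignment. A small helper (mine, for illustration):

```python
import numpy as np

def linf_error(u, u_star):
    # align the global sign before measuring the entrywise error
    return min(np.linalg.norm(u - u_star, ord=np.inf),
               np.linalg.norm(u + u_star, ord=np.inf))
```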