Asymmetry Helps: Eigenvalue and Eigenvector Analyses of Asymmetrically Perturbed Low-Rank Matrices

Yuxin Chen (Electrical Engineering, Princeton University), Chen Cheng (PKU Math), Jianqing Fan (Princeton ORFE)
Eigenvalue / eigenvector estimation

• A rank-1 matrix (the truth): M⋆ = λ⋆ u⋆ u⋆⊤ ∈ R^{n×n}
• Observed noisy data: M = M⋆ + H, where H is the noise
• Goal: estimate the eigenvalue λ⋆ and the eigenvector u⋆
Non-symmetric noise matrix

M = M⋆ + H, where M⋆ = λ⋆ u⋆ u⋆⊤ and H is an asymmetric noise matrix.

This may arise when, e.g., we have 2 samples for each entry of M⋆ and arrange them in an asymmetric manner.
A natural estimation strategy: SVD

• Use the leading singular value λ_svd of M to estimate λ⋆
• Use the leading left singular vector of M to estimate u⋆

A less popular strategy: eigen-decomposition

• Use the leading eigenvalue λ_eigs of M to estimate λ⋆
• Use the leading eigenvector of M to estimate u⋆
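To make the two strategies concrete, here is a minimal NumPy sketch (not the authors' code) of both estimators applied to a noisy matrix M. One subtlety: an asymmetric real matrix can have complex eigenpairs, so the sketch picks the eigenvalue of largest magnitude and keeps its real part, under the assumption that the truth is real.

```python
import numpy as np

def svd_estimate(M):
    # Leading singular value and leading left singular vector of M.
    U, s, Vt = np.linalg.svd(M)
    return s[0], U[:, 0]

def eig_estimate(M):
    # Leading eigenpair of the (possibly asymmetric) matrix M.  Eigenpairs of
    # an asymmetric real matrix may be complex, so take the eigenvalue of
    # largest magnitude and keep the real part, assuming the truth is real.
    w, V = np.linalg.eig(M)
    k = np.argmax(np.abs(w))
    v = V[:, k].real
    return w[k].real, v / np.linalg.norm(v)
```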
SVD vs. eigen-decomposition

For asymmetric matrices:
• Numerical stability: SVD > eigen-decomposition
• (Folklore?) Statistical accuracy: SVD ≍ eigen-decomposition

Shall we always prefer SVD over eigen-decomposition?
A curious numerical experiment: Gaussian noise

M = u⋆ u⋆⊤ + H, where {H_{i,j}} are i.i.d. N(0, σ²) with σ = 1/√(n log n)
(here M⋆ = u⋆ u⋆⊤, so λ⋆ = 1)

[Figure: eigenvalue estimation error vs. n for SVD, eigen-decomposition, and the rescaled SVD error |λ_svd − λ⋆| / (2.5√n); the rescaled SVD curve lies on top of the eigen-decomposition curve]

Empirically, |λ_eigs − λ⋆| ≈ |λ_svd − λ⋆| / (2.5√n).
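This experiment can be reproduced with a short Monte Carlo sketch. Assumptions not stated on the slide: u⋆ is taken to be the constant unit vector (so λ⋆ = 1 and μ = 1), and one trial per n is run; the error ratio should grow roughly like √n, echoing the plot.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_trial(n):
    u = np.ones(n) / np.sqrt(n)                # spread-out unit vector: mu = 1
    M_star = np.outer(u, u)                    # rank-1 truth, lambda* = 1
    sigma = 1.0 / np.sqrt(n * np.log(n))
    H = sigma * rng.standard_normal((n, n))    # asymmetric: H_ij indep. of H_ji
    M = M_star + H
    lam_svd = np.linalg.svd(M, compute_uv=False)[0]
    w = np.linalg.eigvals(M)
    lam_eig = w[np.argmax(np.abs(w))].real
    return abs(lam_svd - 1.0), abs(lam_eig - 1.0)

for n in [200, 500, 1000, 2000]:
    e_svd, e_eig = one_trial(n)
    print(f"n={n:4d}  SVD err={e_svd:.2e}  eig err={e_eig:.2e}  "
          f"ratio={e_svd / e_eig:.1f}  2.5*sqrt(n)={2.5 * np.sqrt(n):.1f}")
```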
Another numerical experiment: matrix completion

M⋆ = u⋆ u⋆⊤;   M_{i,j} = (1/p) M⋆_{i,j} with prob. p, and 0 else, where p = 3 log n / n

[Figure: eigenvalue estimation error vs. n for SVD, eigen-decomposition, and the rescaled SVD error |λ_svd − λ⋆| / (2.5√n)]

Again, empirically, |λ_eigs − λ⋆| ≈ |λ_svd − λ⋆| / (2.5√n).
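A sketch of this sampling model, under the same assumptions as before (u⋆ constant, λ⋆ = 1). Note that entries are sampled independently, so the mask, and hence the effective noise H = M − M⋆, is asymmetric:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
u = np.ones(n) / np.sqrt(n)
M_star = np.outer(u, u)                  # rank-1 truth, lambda* = 1
p = 3 * np.log(n) / n
mask = rng.random((n, n)) < p            # each entry kept independently,
M = mask * M_star / p                    # so the mask is asymmetric; E[M] = M*

lam_svd = np.linalg.svd(M, compute_uv=False)[0]
w = np.linalg.eigvals(M)
lam_eig = w[np.argmax(np.abs(w))].real
print(abs(lam_svd - 1.0), abs(lam_eig - 1.0))
```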
Why does eigen-decomposition work so much better than SVD?
Problem setup

M = u⋆ u⋆⊤ + H ∈ R^{n×n}, with M⋆ = u⋆ u⋆⊤

• H: noise matrix
  ◦ independent entries: {H_{i,j}} are independent
  ◦ zero mean: E[H_{i,j}] = 0
  ◦ variance: Var(H_{i,j}) ≤ σ²
  ◦ magnitudes: P{|H_{i,j}| ≥ B} ≲ n^{−12}
• M⋆ obeys the incoherence condition
  max_{1≤i≤n} |e_i⊤ u⋆| ≤ √(μ/n)
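For intuition, the incoherence parameter μ is just n · max_i |u⋆_i|² for a unit vector u⋆: μ = 1 for a perfectly spread-out eigenvector and μ = n for a standard basis vector. A tiny illustrative snippet (the example vectors are hypothetical):

```python
import numpy as np

def incoherence(u_star):
    # mu such that max_i |u*_i| = sqrt(mu / n), for a unit vector u*.
    n = u_star.shape[0]
    return n * np.max(np.abs(u_star)) ** 2

print(incoherence(np.ones(100) / 10.0))   # fully spread out: mu = 1
print(incoherence(np.eye(100)[0]))        # standard basis vector: mu = n = 100
```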
Classical linear algebra results

|λ_svd − λ⋆| ≤ ‖H‖   (Weyl)
|λ_eigs − λ⋆| ≤ ‖H‖   (Bauer–Fike)

⇓ matrix Bernstein inequality

|λ_svd − λ⋆| ≲ σ√(n log n) + B log n   (reasonably tight if ‖H‖ is large)
|λ_eigs − λ⋆| ≲ σ√(n log n) + B log n   (can be significantly improved)
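One can check the two classical bounds numerically on a random instance; below is a sanity sketch, not part of any proof, under the Gaussian setup from the earlier experiment (u⋆ constant, λ⋆ = 1). Both bounds hold, but for the eigenvalue of the asymmetric M the bound has a lot of slack, which is exactly what the next slide exploits.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
u = np.ones(n) / np.sqrt(n)
H = rng.standard_normal((n, n)) / np.sqrt(n * np.log(n))
M = np.outer(u, u) + H

H_norm = np.linalg.norm(H, 2)            # spectral norm of the noise
lam_svd = np.linalg.svd(M, compute_uv=False)[0]
w = np.linalg.eigvals(M)
lam_eig = w[np.argmax(np.abs(w))].real

print(abs(lam_svd - 1.0), H_norm)        # Weyl bound holds
print(abs(lam_eig - 1.0), H_norm)        # Bauer-Fike holds, with lots of slack
```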
Main results: eigenvalue perturbation

Theorem 1 (Chen, Cheng, Fan ’18). With high prob., the leading eigenvalue λ_eigs of M obeys
|λ_eigs − λ⋆| ≲ (σ√(n log n) + B log n) · √(μ/n)

[Figure: the empirical eigen-decomposition error vs. n matches the rescaled SVD error |λ_svd − λ⋆| / (2.5√n)]

• Eigen-decomposition is √(n/μ) times better than SVD!
  (recall |λ_svd − λ⋆| ≲ σ√(n log n) + B log n)
Main results: entrywise eigenvector perturbation

Theorem 2 (Chen, Cheng, Fan ’18). With high prob., the leading eigenvector u of M obeys
min{‖u − u⋆‖∞, ‖u + u⋆‖∞} ≲ (σ√(n log n) + B log n) · √(μ/n)

• If ‖H‖ ≪ |λ⋆|, then
  ◦ min{‖u − u⋆‖₂, ‖u + u⋆‖₂} ≪ ‖u⋆‖₂   (classical bound)
  ◦ min{‖u − u⋆‖∞, ‖u + u⋆‖∞} ≪ ‖u⋆‖∞   (our bound)
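Since u⋆ is identifiable only up to a global sign, measuring the entrywise error requires the min over ± that appears in Theorem 2. A small helper sketch (the function name is hypothetical; M and u⋆ are as in the problem setup):

```python
import numpy as np

def linf_error(M, u_star):
    # Entrywise eigenvector error, minimized over the sign ambiguity.
    w, V = np.linalg.eig(M)
    v = V[:, np.argmax(np.abs(w))].real
    v = v / np.linalg.norm(v)
    return min(np.max(np.abs(v - u_star)), np.max(np.abs(v + u_star)))
```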