Phase transitions in low-rank matrix estimation

Slides by Marc Lelarge & Léo Miolane (INRIA, ENS), May 11, 2017.


  1. Phase transitions in low-rank matrix estimation. May 11, 2017. Marc Lelarge & Léo Miolane, INRIA, ENS.

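  2. Introduction: The statistical model

     "Spiked Wigner" model:

     $$\underbrace{Y}_{\text{observations}} = \underbrace{\sqrt{\frac{\lambda}{n}}\, XX^\intercal}_{\text{signal}} + \underbrace{Z}_{\text{noise}}$$

     ◮ $X$: vector of dimension $n$ with entries $X_i \overset{\text{i.i.d.}}{\sim} P_0$, where $\mathbb{E} X_1 = 0$ and $\mathbb{E} X_1^2 = 1$.
     ◮ $Z_{i,j} = Z_{j,i} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, 1)$.
     ◮ $\lambda$: signal-to-noise ratio.

     Goal: recover the low-rank matrix $XX^\intercal$ from $Y$.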
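As a minimal sketch of the model above, the following NumPy snippet draws one sample $(X, Y)$. The function name `sample_spiked_wigner` is hypothetical, and the Rademacher prior ($X_i = \pm 1$ with probability 1/2) is an assumption chosen only because it satisfies $\mathbb{E} X_1 = 0$, $\mathbb{E} X_1^2 = 1$; the slides leave $P_0$ general. The diagonal of $Z$ is also taken to be $\mathcal{N}(0,1)$ here, which is an assumption.

```python
import numpy as np

def sample_spiked_wigner(n, lam, rng=None):
    """Draw (X, Y) with Y = sqrt(lam / n) * X X^T + Z (hypothetical helper)."""
    rng = np.random.default_rng(rng)
    # Assumed prior P_0: Rademacher entries, mean 0 and variance 1.
    X = rng.choice([-1.0, 1.0], size=n)
    # Symmetric Gaussian noise: Z[i, j] = Z[j, i] ~ N(0, 1).
    G = rng.standard_normal((n, n))
    Z = np.triu(G, 1)
    Z = Z + Z.T + np.diag(np.diag(G))
    Y = np.sqrt(lam / n) * np.outer(X, X) + Z
    return X, Y

X, Y = sample_spiked_wigner(50, lam=2.0, rng=0)
```

Any other prior with zero mean and unit variance could be substituted for the Rademacher draw without changing the rest of the sketch.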

  3. Principal component analysis (PCA): B.B.P. phase transition

     ◮ The matrix $Y/\sqrt{n} = \sqrt{\lambda}\, XX^\intercal / n + Z/\sqrt{n}$ is a perturbed low-rank matrix.
     ◮ Estimate $X$ using the eigenvector $\hat{x}_n$ associated with the largest eigenvalue $\mu_n$ of $Y/\sqrt{n}$.

     [Figure: spectral density of the signal; limiting spectral density of the noise.]

     B.B.P. phase transition:

     ◮ if $\lambda \le 1$: $\mu_n \to 2$ and $X \cdot \hat{x}_n \to 0$;
     ◮ if $\lambda > 1$: $\mu_n \to \sqrt{\lambda} + \frac{1}{\sqrt{\lambda}} > 2$ and $|X \cdot \hat{x}_n| \to \sqrt{1 - 1/\lambda} > 0$.

     Baik et al., 2005; Benaych-Georges and Nadakuditi, 2011.
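The B.B.P. predictions can be checked numerically. The sketch below is illustrative only: the helper names `pca_estimate` and `bbp_prediction` are hypothetical, the prior is assumed to be Rademacher, and the overlap is computed between unit vectors, i.e. as $|X \cdot \hat{x}_n| / \|X\|$ with $\|\hat{x}_n\| = 1$, which is an assumption about the normalization left implicit on the slide.

```python
import numpy as np

def pca_estimate(Y):
    """Top eigenvalue and unit eigenvector of Y / sqrt(n)."""
    n = Y.shape[0]
    eigvals, eigvecs = np.linalg.eigh(Y / np.sqrt(n))  # eigenvalues in ascending order
    return eigvals[-1], eigvecs[:, -1]

def bbp_prediction(lam):
    """Limiting top eigenvalue and overlap stated on the slide."""
    if lam <= 1:
        return 2.0, 0.0
    return np.sqrt(lam) + 1 / np.sqrt(lam), np.sqrt(1 - 1 / lam)

rng = np.random.default_rng(0)
n = 800
X = rng.choice([-1.0, 1.0], size=n)          # assumed Rademacher prior
G = rng.standard_normal((n, n))
Z = np.triu(G, 1)
Z = Z + Z.T + np.diag(np.diag(G))            # symmetric N(0, 1) noise

results = {}
for lam in (0.25, 4.0):                      # one value below and one above the threshold
    Y = np.sqrt(lam / n) * np.outer(X, X) + Z
    mu, x_hat = pca_estimate(Y)
    overlap = abs(X @ x_hat) / np.linalg.norm(X)
    results[lam] = (mu, overlap)
    print(lam, (mu, overlap), bbp_prediction(lam))
```

At finite $n$ the empirical values fluctuate around the limits, so agreement should only be expected up to finite-size corrections.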

  6. Questions

     ◮ PCA fails when $\lambda \le 1$, but is it still possible to recover the signal?
     ◮ When $\lambda > 1$, is PCA optimal?
     ◮ More generally, what is the best achievable estimation performance in both regimes?

  9. MMSE and information-theoretic threshold

     Goal:

     $$\mathrm{MMSE}_n = \min_{\hat\theta} \frac{1}{n^2}\, \mathbb{E}\Big[\big\| XX^\intercal - \hat\theta(Y) \big\|^2\Big] = \frac{1}{n^2} \sum_{1 \le i,j \le n} \mathbb{E}\Big[\big(X_i X_j - \mathbb{E}[X_i X_j \mid Y]\big)^2\Big] \le \underbrace{\mathbb{E}[X^2]^2}_{\text{Dummy MSE}}$$

     Information-theoretic threshold:

     1. Compute $\lim_{n \to \infty} \mathrm{MMSE}_n$.
     2. Deduce the information-theoretic threshold, i.e. the critical value $\lambda_c$ such that
        ◮ if $\lambda > \lambda_c$, $\lim_{n \to \infty} \mathrm{MMSE}_n < \text{Dummy MSE}$;
        ◮ if $\lambda < \lambda_c$, $\lim_{n \to \infty} \mathrm{MMSE}_n = \text{Dummy MSE}$.
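The "Dummy MSE" can be made explicit with a short computation. This is a sketch under two assumptions not stated on the slide: the dummy estimator is taken to be the constant $\hat\theta = \mathbb{E}[XX^\intercal]$ (which ignores $Y$), and $P_0$ is assumed to have a finite fourth moment.

```latex
% Dummy estimator: \hat\theta_{i,j} = E[X_i X_j], which is 0 for i \neq j and E[X^2] for i = j.
\begin{align*}
\frac{1}{n^2} \sum_{1 \le i,j \le n} \mathbb{E}\big[(X_i X_j - \mathbb{E}[X_i X_j])^2\big]
  &= \frac{1}{n^2} \sum_{i \neq j} \mathbb{E}[X_i^2]\,\mathbb{E}[X_j^2]
     + \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i^2) \\
  &= \frac{n-1}{n}\, \mathbb{E}[X^2]^2 + \frac{1}{n} \operatorname{Var}(X^2)
  \;\xrightarrow[n \to \infty]{}\; \mathbb{E}[X^2]^2 .
\end{align*}
```

Under these assumptions the bound holds up to a term of order $1/n$, and with the normalization $\mathbb{E} X_1^2 = 1$ the limiting dummy MSE equals 1.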

  11. Connection with statistical physics: A planted spin glass model

     ◮ Compute the MMSE for $Y = \sqrt{\frac{\lambda}{n}}\, XX^\intercal + Z$.
     ◮ Study the posterior $P(x \mid Y) = \frac{1}{\mathcal{Z}_n} P_0(x) \exp(H_n(x))$, where $\mathcal{Z}_n$ is the normalizing constant and

       $$H_n(x) = \sum_{i<j} \sqrt{\frac{\lambda}{n}}\, Y_{i,j}\, x_i x_j - \frac{\lambda}{2n}\, x_i^2 x_j^2 = \sum_{i<j} \underbrace{\sqrt{\frac{\lambda}{n}}\, Z_{i,j}\, x_i x_j}_{\text{SK}} + \underbrace{\frac{\lambda}{n}\, X_i X_j\, x_i x_j - \frac{\lambda}{2n}\, x_i^2 x_j^2}_{\text{planted solution}}$$

     ◮ Compute the limit of the free energy $F_n = \frac{1}{n} \mathbb{E} \log \mathcal{Z}_n$, because

       $$\text{Constant} - F_n = \frac{1}{n} I(X; Y) \;\xrightarrow{\ \partial / \partial\lambda\ }\; \mathrm{MMSE}$$

       that is, differentiating the mutual information with respect to $\lambda$ gives access to the MMSE.
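The form of $H_n$ follows from Bayes' rule and the Gaussian likelihood. The derivation below is a sketch that assumes, as the sum over $i < j$ on the slide suggests, that only the off-diagonal observations are used, and it writes $P_0(x)$ for the product prior $\prod_i P_0(x_i)$.

```latex
\begin{align*}
P(x \mid Y)
  &\propto P_0(x) \prod_{i<j} \exp\!\Big( -\tfrac{1}{2} \big( Y_{i,j} - \sqrt{\tfrac{\lambda}{n}}\, x_i x_j \big)^2 \Big) \\
  &\propto P_0(x) \exp\!\Big( \sum_{i<j} \sqrt{\tfrac{\lambda}{n}}\, Y_{i,j}\, x_i x_j - \tfrac{\lambda}{2n}\, x_i^2 x_j^2 \Big)
   = P_0(x)\, e^{H_n(x)} .
\end{align*}
% The factor exp(-Y_{i,j}^2 / 2) does not depend on x and is absorbed into the normalization.
% Substituting Y_{i,j} = \sqrt{\lambda/n}\, X_i X_j + Z_{i,j} gives the second expression for H_n:
%   \sqrt{\lambda/n}\, Z_{i,j} x_i x_j + (\lambda/n) X_i X_j x_i x_j - (\lambda/2n) x_i^2 x_j^2 .
```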

  14. Replica symmetric formula: The scalar channel

     Lesieur et al., 2015 conjectured that the problem is characterized by the scalar channel

     $$Y_0 = \sqrt{\gamma}\, X_0 + Z_0$$

     and the scalar free energy

     $$F(\gamma) = \mathbb{E} \log \Big( \sum_{x_0} P_0(x_0)\, e^{\sqrt{\gamma}\, Y_0 x_0 - \frac{\gamma}{2} x_0^2} \Big)$$

     Replica symmetric formula:

     $$F_n \xrightarrow[n \to \infty]{} \sup_{q \ge 0} \Big\{ F(\lambda q) - \frac{\lambda}{4} q^2 \Big\}$$

     $$\mathrm{MMSE}_n \xrightarrow[n \to \infty]{} \mathbb{E}_{P_0}[X^2]^2 - q^*(\lambda)^2$$

     where $q^*(\lambda)$ denotes the maximizer of the supremum above.

     Proved by Barbier et al., 2016, extended by Lelarge and Miolane, 2016.
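The replica symmetric formula can be evaluated numerically for a discrete prior. The sketch below is an illustration, not the authors' code: the function names are hypothetical, the expectation over $Z_0$ is approximated by Gauss-Hermite quadrature, the supremum is replaced by a grid search over $q \in [0, 1]$ (assuming the maximizer satisfies $q^* \le \mathbb{E}[X^2] = 1$), and the prior is the two-point family with parameter $p$.

```python
import numpy as np

def two_point_prior(p):
    """Support and weights of a two-point P_0 with mean 0 and variance 1."""
    support = np.array([np.sqrt((1 - p) / p), -np.sqrt(p / (1 - p))])
    weights = np.array([p, 1 - p])
    return support, weights

def scalar_free_energy(gamma, support, weights, deg=80):
    """Approximate F(gamma) = E log sum_x P_0(x) exp(sqrt(gamma) Y_0 x - gamma x^2 / 2)."""
    z, w = np.polynomial.hermite_e.hermegauss(deg)  # nodes/weights for exp(-z^2 / 2)
    w = w / np.sqrt(2 * np.pi)                      # normalize to the N(0, 1) density
    total = 0.0
    for x0, p0 in zip(support, weights):            # average over X_0 ~ P_0
        y = np.sqrt(gamma) * x0 + z                 # Y_0 evaluated at the quadrature nodes
        expo = (np.sqrt(gamma) * y[:, None] * support[None, :]
                - 0.5 * gamma * support[None, :] ** 2)
        m = expo.max(axis=1, keepdims=True)         # log-sum-exp for numerical stability
        log_sum = m[:, 0] + np.log((weights[None, :] * np.exp(expo - m)).sum(axis=1))
        total += p0 * np.dot(w, log_sum)
    return total

def replica_symmetric_mmse(lam, p, num_q=201):
    """Grid-search q* and return (q*, E[X^2]^2 - q*^2), with E[X^2] = 1."""
    support, weights = two_point_prior(p)
    q_grid = np.linspace(0.0, 1.0, num_q)
    values = np.array([scalar_free_energy(lam * q, support, weights) - lam * q ** 2 / 4
                       for q in q_grid])
    q_star = q_grid[np.argmax(values)]
    return q_star, 1.0 - q_star ** 2

q_low, mmse_low = replica_symmetric_mmse(0.5, p=0.5)    # symmetric prior, weak signal
q_high, mmse_high = replica_symmetric_mmse(2.0, p=0.5)  # symmetric prior, strong signal
```

A grid search is used instead of a fixed-point iteration because the objective may have several local maxima, and only the global one enters the formula.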

  16. Some curves

     ◮ We will plot the MMSE and $\mathrm{MSE}^{\mathrm{PCA}}$ curves when $P_0$ is of the form

       $$P_0\Big(\sqrt{(1-p)/p}\Big) = p, \qquad P_0\Big(-\sqrt{p/(1-p)}\Big) = 1 - p$$

       for some $p \in (0, 1)$.
     ◮ One can show that the corresponding matrix estimation problem is, in some sense, equivalent to the community detection problem with 2 asymmetric communities.
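A quick check confirms that this two-point prior meets the normalization $\mathbb{E} X_1 = 0$, $\mathbb{E} X_1^2 = 1$ required by the model:

```latex
\begin{align*}
\mathbb{E}[X] &= p \sqrt{\tfrac{1-p}{p}} - (1-p) \sqrt{\tfrac{p}{1-p}}
              = \sqrt{p(1-p)} - \sqrt{p(1-p)} = 0, \\
\mathbb{E}[X^2] &= p \cdot \tfrac{1-p}{p} + (1-p) \cdot \tfrac{p}{1-p}
                = (1-p) + p = 1 .
\end{align*}
```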

  17. [Figure: MMSE, $\mathrm{MSE}^{\mathrm{AMP}}$ and $\mathrm{MSE}^{\mathrm{PCA}}$ as functions of $\lambda$.] MMSE, $\mathrm{MSE}^{\mathrm{PCA}}$ and $\mathrm{MSE}^{\mathrm{AMP}}$, asymmetric SBM: $p = 0.05$.

  18. [Figure] "Free energy landscape", $p = 0.05$, $\lambda = 0.63$.

  19. [Figure: phase diagram in the $(p, \lambda)$ plane, with $p$ from 0 to 0.5 and $\lambda$ from 0 to 1.2; regions EASY, HARD and IMPOSSIBLE; curves K-S, $\lambda_c$ and $\lambda_{sp}$; marker $p^*$.] Phase diagram from Caltagirone et al., 2016.

  20. Thank you for your attention. Any questions?

  21. References I

     ◮ Baik, Jinho, Gérard Ben Arous, and Sandrine Péché (2005). "Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices". In: Annals of Probability, pp. 1643–1697.
     ◮ Barbier, Jean et al. (2016). "Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula". In: Advances in Neural Information Processing Systems, pp. 424–432.
     ◮ Benaych-Georges, Florent and Raj Rao Nadakuditi (2011). "The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices". In: Advances in Mathematics 227.1, pp. 494–521.
     ◮ Caltagirone, Francesco, Marc Lelarge, and Léo Miolane (2016). "Recovering asymmetric communities in the stochastic block model". In: arXiv preprint arXiv:1610.03680.
     ◮ Lelarge, Marc and Léo Miolane (2016). "Fundamental limits of symmetric low-rank matrix estimation". In: arXiv preprint arXiv:1611.03888.

  22. References II

     ◮ Lesieur, Thibault, Florent Krzakala, and Lenka Zdeborová (2015). "MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel". In: 53rd Annual Allerton Conference on Communication, Control, and Computing, Allerton 2015, Allerton Park & Retreat Center, Monticello, IL, USA, September 29 - October 2, 2015. IEEE, pp. 680–687. ISBN: 978-1-5090-1824-6. DOI: 10.1109/ALLERTON.2015.7447070. URL: http://dx.doi.org/10.1109/ALLERTON.2015.7447070.
