Online robust matrix factorization for dependent data streams


  1. Online robust matrix factorization for dependent data streams
     Hanbaek Lyu, Department of Mathematics, University of California, Los Angeles
     Seminar on Applied Math and Data Science, HKUST
     Joint work with HanQin Cai and Deanna Needell
     Mar. 24, 2019
     Hanbaek Lyu (UCLA) Online robust matrix factorization for dependent data streams

  2. Overview
     1 Introduction
     2 ORMF algorithm and convergence result
     3 Applications: Dictionary learning from networks

  3. Section 1: Introduction

  6. Introduction. Learning parts of images: image reconstruction
     [Figure: original image, Cycle by M. C. Escher (1938); 11 by 11 dictionary learned from the original image; image reconstructed using the learned dictionary (basis)]
     ◮ Dictionary learning enables a compressed representation of complex objects using a few dictionary elements.
     ◮ Used in data compression, reconstruction, transfer learning, etc.
     ◮ Image reconstruction = (local approximation by the dictionary) + (averaging)
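
The last bullet above (image reconstruction = local approximation by the dictionary + averaging) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the speaker's code: the function name `reconstruct_by_patch_averaging`, the unconstrained least-squares coding step, and the toy identity dictionary are all assumptions made for the example.

```python
import numpy as np

def reconstruct_by_patch_averaging(image, W, k):
    """Approximate every k-by-k patch with dictionary W, then average the overlaps."""
    rows, cols = image.shape
    recon_sum = np.zeros_like(image, dtype=float)
    counts = np.zeros_like(image, dtype=float)
    for i in range(rows - k + 1):
        for j in range(cols - k + 1):
            patch = image[i:i + k, j:j + k].reshape(-1)
            # Local approximation: least-squares code for this patch
            # (an assumed stand-in for a constrained or sparse coding step).
            code, *_ = np.linalg.lstsq(W, patch, rcond=None)
            approx = (W @ code).reshape(k, k)
            recon_sum[i:i + k, j:j + k] += approx
            counts[i:i + k, j:j + k] += 1.0
    # Averaging: each pixel is the mean of all patch approximations covering it.
    return recon_sum / counts

rng = np.random.default_rng(0)
image = rng.random((8, 8))
k = 3

# With a complete basis (identity), every patch is reproduced exactly.
W_full = np.eye(k * k)
recon = reconstruct_by_patch_averaging(image, W_full, k)

# With only 4 dictionary elements the reconstruction is a compressed approximation.
W_small = np.eye(k * k)[:, :4]
recon_small = reconstruct_by_patch_averaging(image, W_small, k)
```

With the complete basis the averaged reconstruction coincides with the input image; with fewer dictionary elements each patch is only approximated, and the averaging step smooths the disagreement between overlapping patches.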

  7. Introduction. Simultaneous dictionary learning and outlier detection
     [Figure panels: corrupted image; reconstructed image; detected outlier; dictionary learned by ORNMF]
     ◮ What defines an outlier? How can we detect them?
     ◮ Low-rank based approach: Outlier = Data − Low-rank approximation
     ◮ Dictionary-based approach: Outlier = Data − Reconstruction from dictionary
     ◮ Dictionary learning has to be done in a robust way

  10. Introduction
      ◮ Matrix Factorization is a fundamental tool in dictionary learning problems:
        $X \approx W H$, with data $X \in \mathbb{R}^{d \times n}$, dictionary $W \in \mathbb{R}^{d \times r}$ (rank-$r$ basis), and code $H \in \mathbb{R}^{r \times n}$.
      ◮ Formulated as an optimization problem:
        $\operatorname{minimize}_{W \in C,\ H \in C'} \ \|X - WH\| + \lambda_1 \|H\|_1$ (reconstruction error, subject to constraints)
      ◮ Non-convex optimization problem → No guarantee for global convergence
      [Figure: sample patches form the data matrix (dimensions labeled 441 and 10000), which NMF approximates as Dictionary × Code]
      [Figure: a graph on nodes 1 to 4, its adjacency matrix with rows (0 1 0 0), (1 0 1 1), (0 1 0 1), (0 1 1 0), and the corresponding pixel picture]
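
As a rough illustration of the optimization problem on this slide, the sketch below runs multiplicative updates for a sparse NMF objective. Several things here are assumptions, not taken from the talk: the unspecified norm is read as the squared Frobenius norm (with a factor 1/2), the constraint sets are taken to be non-negativity, and the names `sparse_nmf` and `nmf_objective` are made up for the example. Because the problem is non-convex, the result depends on the random initialization.

```python
import numpy as np

def nmf_objective(X, W, H, lam1):
    """0.5 * ||X - WH||_F^2 + lam1 * ||H||_1 (assumed form of the objective)."""
    return 0.5 * np.linalg.norm(X - W @ H, "fro") ** 2 + lam1 * np.abs(H).sum()

def sparse_nmf(X, r, lam1=0.1, n_iter=200, seed=0, eps=1e-10):
    """Multiplicative updates keeping W >= 0 and H >= 0 throughout."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    W = rng.random((d, r)) + 0.1   # strictly positive initialization
    H = rng.random((r, n)) + 0.1
    losses = [nmf_objective(X, W, H, lam1)]
    for _ in range(n_iter):
        # Code update: the L1 penalty enters the denominator.
        H *= (W.T @ X) / (W.T @ W @ H + lam1 + eps)
        # Dictionary update: plain least-squares multiplicative step.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        losses.append(nmf_objective(X, W, H, lam1))
    return W, H, losses

# Toy non-negative data of rank 3 (d = 20 features, n = 50 samples).
rng = np.random.default_rng(1)
X = rng.random((20, 3)) @ rng.random((3, 50))
W, H, losses = sparse_nmf(X, r=3)
```

Running it with different `seed` values generally gives different factor pairs with similar objective values, which is the practical face of the "no guarantee for global convergence" remark.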

  13. Introduction
      ◮ Robust Matrix Factorization enables simultaneous dictionary learning and outlier detection:
        $X \approx W H + S$, with data $X \in \mathbb{R}^{d \times n}$, dictionary $W \in \mathbb{R}^{d \times r}$, code $H \in \mathbb{R}^{r \times n}$, and outlier $S \in \mathbb{R}^{d \times n}$.
      ◮ Formulated as an optimization problem:
        $\operatorname{minimize}_{W \in C,\ H \in C'} \ \|X - WH - S\| + \lambda_1 \|H\|_1 + \lambda_2 \|S\|_1$ (reconstruction error, subject to constraints)
      ◮ Non-convex optimization problem → No guarantee for global convergence
      [Figure: original image; reconstructed image]
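
One simple way to attack the robust objective above is block-wise alternating minimization. The sketch below is not the ORMF algorithm of the talk; it is a batch illustration under stated assumptions: the norm is read as the squared Frobenius norm (with a factor 1/2), the constraints are non-negativity of $W$ and $H$, and the names `soft_threshold`, `rmf_objective`, and `robust_nmf` are hypothetical.

```python
import numpy as np

def soft_threshold(A, tau):
    """Entrywise shrinkage: the exact minimizer of 0.5*(a - s)^2 + tau*|s| over s."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def rmf_objective(X, W, H, S, lam1, lam2):
    fit = 0.5 * np.linalg.norm(X - W @ H - S, "fro") ** 2
    return fit + lam1 * np.abs(H).sum() + lam2 * np.abs(S).sum()

def robust_nmf(X, r, lam1=0.01, lam2=0.5, n_iter=300, seed=0):
    rng = np.random.default_rng(seed)
    d, n = X.shape
    W = rng.random((d, r))
    H = rng.random((r, n))
    S = np.zeros_like(X)
    history = [rmf_objective(X, W, H, S, lam1, lam2)]
    for _ in range(n_iter):
        # Outlier step: exact minimization over S by soft-thresholding the residual.
        S = soft_threshold(X - W @ H, lam2)
        # Code step: one proximal gradient step with step size 1/L_H,
        # projected onto H >= 0 (so the L1 penalty is a constant shift).
        L_H = np.linalg.norm(W.T @ W, 2) + 1e-12
        grad_H = W.T @ (W @ H + S - X)
        H = np.maximum(H - (grad_H + lam1) / L_H, 0.0)
        # Dictionary step: one projected gradient step with step size 1/L_W.
        L_W = np.linalg.norm(H @ H.T, 2) + 1e-12
        grad_W = (W @ H + S - X) @ H.T
        W = np.maximum(W - grad_W / L_W, 0.0)
        history.append(rmf_objective(X, W, H, S, lam1, lam2))
    return W, H, S, history

# Toy data: non-negative rank-3 matrix plus sparse large corruptions.
rng = np.random.default_rng(1)
X_clean = rng.random((20, 3)) @ rng.random((3, 60))
mask = rng.random(X_clean.shape) < 0.05
X = X_clean + 5.0 * mask
W, H, S, history = robust_nmf(X, r=3)
```

The nonzero entries of `S` play the role of the detected outliers (Outlier = Data − Reconstruction from dictionary), and `lam2` controls how large a residual must be before it is treated as an outlier rather than fitted by the dictionary.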

  14. Introduction. Matrix Factorization: other examples
      ◮ Singular Value Decomposition (SVD): minimize $\|X - WH\|_F$ over $W \in \mathbb{R}^{d \times r}$, $H \in \mathbb{R}^{r \times n}$
      ◮ Non-negative Matrix Factorization (NMF): minimize $\|X - WH\|_F$ over $W \in \mathbb{R}^{d \times r}_{\geq 0}$, $H \in \mathbb{R}^{r \times n}_{\geq 0}$
        - Corresponding dictionary columns can be interpreted as 'parts' of the data matrix (Lee, Seung '99 [lee1999learning])
      ◮ Subspace Clustering (may have $r > d$): minimize $\|X - WH\|_F$ over $W \in \mathbb{R}^{d \times r}$, $H$ group sparse
      Matrix Completion, Probabilistic PCA, Sparse PCA, Robust PCA, Poisson PCA, Heteroscedastic PCA, Bilinear Inverse Problems, Robust NMF, Max-Plus Factorization ...
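
For the unconstrained (SVD) case in the list above, the minimizer is available in closed form: by the Eckart–Young theorem, truncating the SVD to the top $r$ singular values gives a best rank-$r$ approximation in Frobenius norm, and the residual equals the root of the sum of the squared discarded singular values. A small NumPy check, with the helper name `truncated_svd_factorization` chosen for this example:

```python
import numpy as np

def truncated_svd_factorization(X, r):
    """Return W (d x r), H (r x n) with WH equal to the rank-r truncated SVD of X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    W = U[:, :r] * s[:r]   # scale the top-r left singular vectors
    H = Vt[:r, :]          # top-r right singular vectors
    return W, H, s

rng = np.random.default_rng(0)
X = rng.standard_normal((15, 40))
r = 4
W, H, s = truncated_svd_factorization(X, r)

err = np.linalg.norm(X - W @ H, "fro")
tail = np.sqrt(np.sum(s[r:] ** 2))   # energy in the discarded singular values
```

Here `err` and `tail` agree up to floating-point error. Note that `W` and `H` generally contain negative entries, which is why the NMF variant needs its own constrained solver.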

  15. Introduction. Illustration of RMF application to images
      [Figure: square patches of side $k$ are sampled from the image to form the data matrix (one dimension labeled "# of sq. patches sampled"); RNMF approximates the data matrix as Dictionary (rank-$r$ basis) × Code + Outlier]
      [Figure: a graph on nodes 1 to 4, its adjacency matrix with rows (0 1 0 0), (1 0 1 1), (0 1 0 1), (0 1 1 0), and the corresponding pixel picture]

  18. Introduction. Online RMF
      [Diagram: a time-indexed sequence with rows: underlying information $Y_t$; observed data $X_t$; dictionary $W_t$; (code, noise) $(H_t, S_t)$]
      ◮ The data matrix could be too large to be loaded into memory or processed at once.
      ◮ Only sub-matrices of a huge data set may be available through sampling.
      ◮ We may want to learn from a complicated probability distribution on the sample space of data, e.g., a posterior distribution.
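
To make the online setting concrete, the sketch below processes one small sub-matrix at a time and keeps only fixed-size running averages, so memory does not grow with the number of samples seen. It is a generic online dictionary learning scheme written for illustration, not the ORMF algorithm presented in the talk. The names `code_batch`, `online_rmf`, and `stream`, the non-negativity constraints, and the i.i.d. toy stream are all assumptions; the dependent (e.g. Markovian) sampling that the talk is about is not modeled here.

```python
import numpy as np

def soft_threshold(A, tau):
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def code_batch(X_t, W, lam1, lam2, n_steps=30):
    """Approximate code H >= 0 and outlier S for one batch with the dictionary fixed."""
    H = np.zeros((W.shape[1], X_t.shape[1]))
    S = np.zeros_like(X_t)
    L = np.linalg.norm(W.T @ W, 2) + 1e-12
    for _ in range(n_steps):
        S = soft_threshold(X_t - W @ H, lam2)
        H = np.maximum(H - (W.T @ (W @ H + S - X_t) + lam1) / L, 0.0)
    return H, S

def online_rmf(batches, d, r, lam1=0.01, lam2=0.5, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.random((d, r))
    A = np.zeros((r, r))   # running average of H H^T
    B = np.zeros((d, r))   # running average of (X - S) H^T
    for t, X_t in enumerate(batches, start=1):
        H, S = code_batch(X_t, W, lam1, lam2)
        A += (H @ H.T - A) / t
        B += ((X_t - S) @ H.T - B) / t
        # Dictionary step: projected gradient on the averaged surrogate
        # 0.5 * tr(W A W^T) - tr(W B^T), whose gradient is W A - B.
        L = np.linalg.norm(A, 2) + 1e-12
        for _ in range(10):
            W = np.maximum(W - (W @ A - B) / L, 0.0)
    return W, A, B

def stream(n_batches, d=20, r=3, batch_size=10, seed=1):
    """Toy data stream: each batch is a freshly sampled d x batch_size sub-matrix."""
    rng = np.random.default_rng(seed)
    W_true = rng.random((d, r))
    for _ in range(n_batches):
        yield W_true @ rng.random((r, batch_size))

W_online, A, B = online_rmf(stream(100), d=20, r=3)
```

Only `W`, `A`, and `B` persist between batches, so the state has size $d \times r$ and $r \times r$ regardless of how many batches arrive; each batch can be discarded after its update.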
