Sparse Separable Nonnegative Matrix Factorization
Extending Separable NMF with ℓ0 sparsity constraints

Nicolas Nadisic, Arnaud Vandaele, Jeremy Cohen, Nicolas Gillis
9 October 2020 — GdR MIA Thematic Day
Université de Mons, Belgium
Nonnegative Matrix Factorization

Given a data matrix M ∈ R_+^{m×n} and a rank r ≪ min(m, n), find W ∈ R_+^{m×r} and H ∈ R_+^{r×n} such that M ≈ WH.

In optimization terms, standard NMF is equivalent to:

min_{W ≥ 0, H ≥ 0} ‖M − WH‖_F^2
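To make the objective concrete, here is a minimal NumPy sketch of NMF via the classical Lee–Seung multiplicative updates. This is not the algorithm discussed in this talk; the function name, iteration count, and damping constant are illustrative choices.

```python
import numpy as np

def nmf_multiplicative(M, r, n_iter=200, eps=1e-10, seed=0):
    """Return W (m x r) and H (r x n) with M ~= W @ H and W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(n_iter):
        # Multiplicative updates keep the factors nonnegative and
        # monotonically decrease the Frobenius error.
        H *= (W.T @ M) / (W.T @ W @ H + eps)
        W *= (M @ H.T) / (W @ H @ H.T + eps)
    return W, H
```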
Nonnegative Matrix Factorization

Why nonnegativity?
• More interpretable factors (part-based representation)
• Naturally favors sparsity
• Makes sense in many applications (image processing, hyperspectral unmixing, text mining, ...)
NMF Geometry (M ≈ WH)

[Figure: data points M(:, j) lie inside the convex hull of the vertices W(:, p).]
Application – hyperspectral unmixing

M(:, j) ≈ Σ_p W(:, p) H(p, j)

where M(:, j) is the spectral signature of the j-th pixel, W(:, p) is the spectral signature of the p-th material, and H(p, j) is the abundance of the p-th material in the j-th pixel.

Images from Bioucas Dias and Nicolas Gillis.
Application – hyperspectral unmixing

[Figure: pixels M(:, j) decomposed over material signatures W(:, p): grass, rooftop, trees.]
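To make the mixing model concrete, a toy NumPy example with made-up numbers (3 materials, 4 spectral bands, 2 pixels); the signatures and abundances below are purely hypothetical.

```python
import numpy as np

# Columns of W: spectral signatures of grass, rooftop, trees (made up).
W = np.array([[0.9, 0.2, 0.7],
              [0.8, 0.3, 0.6],
              [0.1, 0.9, 0.2],
              [0.2, 0.8, 0.3]])

# Columns of H: abundances of each material in each pixel.
H = np.array([[0.7, 0.0],
              [0.0, 0.5],
              [0.3, 0.5]])

M = W @ H  # M[:, j] = sum_p W[:, p] * H[p, j], the observed pixels
```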
Starting point 1/2 – Separable NMF

• NMF is NP-hard [Vavasis, 2010].
• Under the separability assumption, it is solvable in polynomial time [Arora et al., 2012].
Starting point 1/2 – Separable NMF

Separability:
• The vertices are selected among the data points
• In hyperspectral unmixing, this is equivalent to the pure-pixel assumption

Standard NMF model:   M = WH
Separable NMF:        M = M(:, J) H
Separable NMF – Geometry

[Figure: data points M(:, j) on the unit simplex; the selected vertices W(:, j) are themselves data points.]
Algorithm for Separable NMF – SNPA

SNPA = Successive Nonnegative Projection Algorithm [Gillis, 2014]
• Start with an empty W and residual R = M
• Alternate between:
  • Greedy selection of one column of R to be added to W
  • Projection of R onto the convex hull of the origin and the columns of W
• Stop when the reconstruction error = 0 (or < ε)
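A simplified sketch of the SNPA loop. For brevity the projection step uses plain NNLS, i.e. projection onto the nonnegative cone of the selected columns, rather than SNPA's projection onto the convex hull of the origin and the selected columns; snpa_sketch is an illustrative name, not the reference implementation.

```python
import numpy as np
from scipy.optimize import nnls

def snpa_sketch(M, r, tol=1e-9):
    """Greedily select up to r column indices of M approximating its vertices."""
    R = M.copy()                       # residual
    selected = []
    for _ in range(r):
        # Greedy step: pick the column with the largest residual norm.
        j = int(np.argmax(np.linalg.norm(R, axis=0)))
        selected.append(j)
        W = M[:, selected]
        # Projection step: re-express every column over the selection.
        H = np.column_stack([nnls(W, M[:, i])[0] for i in range(M.shape[1])])
        R = M - W @ H
        if np.linalg.norm(R) < tol:    # stop when the error vanishes
            break
    return selected
```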
SNPA

[Figure: successive steps of SNPA on the unit simplex, selecting one vertex at a time.]
Limitations of Separable NMF

What if one column of W is a combination of other columns of W?
→ Interior vertex

SNPA cannot identify it, because it belongs to the convex hull of the other vertices.
Limitations of Separable NMF

[Figure: data points M(:, j) on the unit simplex, with exterior vertices and one interior vertex.]
Limitations of Separable NMF

SNPA cannot handle this case: the interior vertex is not identifiable.

However, if the columns of H are sparse (each data point is a combination of only k < r vertices), the interior vertex may become identifiable.
Starting point 2/2 — k-Sparse NMF

M ≈ WH s.t. H is column-wise k-sparse (for all i, ‖H(:, i)‖_0 ≤ k)

Motivations:
• Better interpretability
• Improved results using prior sparsity knowledge
• Example: a pixel expressed as a combination of at most k materials

[Figure: M = WH with a column-wise k-sparse H.]
k-Sparse NMF – Geometry

[Figure: geometry of k-sparse NMF on the unit simplex.]
k-Sparse NMF

k-Sparse NMF is combinatorial, with (r choose k) possible combinations per column of H.

Previous work: a branch-and-bound algorithm for Exact k-Sparse NNLS [Nadisic et al., 2020].

[Figure: branch-and-bound tree. The root node is unconstrained (k′ ≤ n = 5, X = [x1 x2 x3 x4 x5]); each child node fixes one more entry of X to zero, e.g. X = [0 x2 x3 x4 x5] with k′ ≤ 4, then k′ ≤ 3, until k′ ≤ 2 = k, where branching stops.]
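The paper's solver is a branch-and-bound; as a stand-in, here is a brute-force sketch of exact k-sparse NNLS for a single column, which enumerates all supports of size k and keeps the best NNLS fit. It is exponential in r, which is exactly the cost the BnB pruning avoids; the function name is illustrative.

```python
import numpy as np
from itertools import combinations
from scipy.optimize import nnls

def ksparse_nnls(W, m, k):
    """argmin_{h >= 0, ||h||_0 <= k} ||m - W @ h||_2 by support enumeration."""
    r = W.shape[1]
    best_h, best_err = np.zeros(r), np.linalg.norm(m)
    for support in combinations(range(r), k):
        idx = list(support)
        # Solve the unconstrained-support NNLS restricted to these columns.
        h_s, err = nnls(W[:, idx], m)
        if err < best_err:
            best_h = np.zeros(r)
            best_h[idx] = h_s
            best_err = err
    return best_h, best_err
```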
Sparse Separable NMF

Standard NMF model:   M = WH
Separable NMF:        M = M(:, J) H
SSNMF:                M = M(:, J) H s.t. for all i, ‖H(:, i)‖_0 ≤ k
Our approach for SSNMF

Replace the projection step of SNPA: instead of projecting onto the convex hull, project onto the k-sparse hull, using our BnB solver ⇒ kSSNPA.

kSSNPA:
• Identifies all interior vertices (non-selected points are never vertices)
• May also select points that are not vertices (explanation to come!)

⇒ kSSNPA can be seen as a screening technique that reduces the number of points to check.
Our approach for SSNMF

In a nutshell, 3 steps:
1. Identify the exterior vertices with SNPA
2. Identify candidate interior vertices with kSSNPA
3. Discard bad candidates, i.e., those that are k-sparse combinations of other selected points (they cannot be vertices)

Our algorithm: BRASSENS — Relies on Assumptions of Sparsity and Separability for Elegant NMF Solving.
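A high-level sketch of the pipeline, reusing ksparse_nnls from the previous sketch. For brevity it merges steps 1 and 2 into a single kSSNPA-style greedy loop (plain SNPA being the unconstrained special case); this is an illustration of the idea, not the authors' implementation.

```python
import numpy as np

def brassens_sketch(M, k, tol=1e-6):
    """Return the column indices of M identified as vertices."""
    # Steps 1-2: kSSNPA-style greedy selection -- SNPA with the convex
    # projection replaced by a k-sparse projection.
    selected = []
    while True:
        errs = []
        for i in range(M.shape[1]):
            if selected:
                W = M[:, selected]
                _, e = ksparse_nnls(W, M[:, i], min(k, len(selected)))
            else:
                e = np.linalg.norm(M[:, i])
            errs.append(e)
        j = int(np.argmax(errs))
        if errs[j] < tol:   # every column is k-sparsely represented
            break
        selected.append(j)
    # Step 3: discard candidates that are k-sparse combinations of the
    # other selected points -- such points cannot be vertices.
    vertices = []
    for j in selected:
        others = [i for i in selected if i != j]
        if others:
            _, e = ksparse_nnls(M[:, others], M[:, j], min(k, len(others)))
        else:
            e = np.linalg.norm(M[:, j])
        if e > tol:
            vertices.append(j)
    return vertices
```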
BRASSENS with sparsity k = 2

[Figure: successive steps of BRASSENS with k = 2 on the unit simplex.]
Complexity

• Unlike Separable NMF, SSNMF is NP-hard (Arnaud proved it; see the paper)
• The hardness comes from the k-sparse projection
• Not too bad in practice when r is small, thanks to our BnB solver
Correctness

Assumption 1: No column of W is a nonnegative linear combination of k other columns of W.
⇒ necessary condition for recovery by BRASSENS

Assumption 2: No column of W is a nonnegative linear combination of k other columns of M.
⇒ sufficient condition for recovery by BRASSENS

If data points are k-sparse and generated at random, Assumption 2 holds with probability one.
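In synthetic experiments one can verify Assumption 1 directly on the ground-truth W. A sketch, again reusing ksparse_nnls from above; the function name and tolerance are hypothetical choices.

```python
import numpy as np

def satisfies_assumption1(W, k, tol=1e-9):
    """Check that no column of W is a nonnegative combination of k others."""
    r = W.shape[1]
    for j in range(r):
        others = np.delete(W, j, axis=1)
        _, err = ksparse_nnls(others, W[:, j], min(k, r - 1))
        if err <= tol:   # column j is a k-sparse combination: assumption fails
            return False
    return True
```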
Related work

Only one similar work: [Sun and Xin, 2011]
• Handles only one interior vertex
• Uses a non-optimal, brute-force-like method
Experiments

• Experiments on synthetic datasets with interior vertices
• Experiment on underdetermined multispectral unmixing (Urban image, 309 × 309 pixels, limited to m = 3 spectral bands; we search for r = 5 materials)
• No other algorithm can tackle SSNMF, so comparisons are limited
XP Synthetic: 3 exterior and 2 interior vertices, n grows

[Figure: number of candidate interior vertices and run time (in seconds) as the number of data points n grows from 20 to 200.]
XP Synthetic 2: dimensions grow

m   n    r   k   Number of candidates   Run time (s)
3   25   5   2   5.5                    0.26
4   30   6   3   8.5                    3.30
5   35   7   4   9.5                    38.71
6   40   8   5   13                     395.88

Conclusions from the experiments:
• kSSNPA is efficient at selecting few candidates
• Still, BRASSENS does not scale well :(
XP on the 3-band Urban dataset with r = 5

[Figure: materials extracted by each method.
SNPA: Grass+Trees, Dirt+Road+Rooftops, Rooftops 1+Rooftops, Rooftops 1+Dirt+Road, Dirt+Grass.
BRASSENS (finds 1 interior point): Grass+Trees, Rooftops 1, Road, Rooftops+Road, Dirt+Grass.]
Future work

• Theoretical analysis of robustness to noise
• New real-life applications
Take-home messages

Sparse Separable NMF:
• Combines the constraints of separability and k-sparsity
• A new way to regularize NMF
• Can handle cases that Separable NMF cannot:
  • The underdetermined case
  • Interior vertices
• Is NP-hard (unlike Separable NMF), but actually "not so hard" for small r
• Is provably solved by our approach
• Does not scale well
References

Arora, S., Ge, R., Kannan, R., and Moitra, A. (2012). Computing a nonnegative matrix factorization – provably. In STOC '12.

Gillis, N. (2014). Successive Nonnegative Projection Algorithm for Robust Nonnegative Blind Source Separation. SIAM Journal on Imaging Sciences, 7(2):1420–1450.

Nadisic, N., Vandaele, A., Gillis, N., and Cohen, J. E. (2020). Exact Sparse Nonnegative Least Squares. In ICASSP 2020, pages 5395–5399.
Sun, Y. and Xin, J. (2011). Underdetermined Sparse Blind Source Separation of Nonnegative and Partially Overlapped Data. SIAM Journal on Scientific Computing, 33(4):2063–2094.

Vavasis, S. A. (2010). On the Complexity of Nonnegative Matrix Factorization. SIAM Journal on Optimization.