
New variants of Nonnegative Matrix Factorization for sparsity improvement and maximum biclique finding



  1. New variants of Nonnegative Matrix Factorization for sparsity improvement and maximum biclique finding. Nicolas Gillis (nicolas.gillis@uclouvain.be), in collaboration with François Glineur. UCL/CORE (Center for Operations Research and Econometrics), UCL/INMA (Department of Mathematical Engineering). March 3, 2009, Seminar at CESAME. CESAME Nonnegative Matrix Factorization 1

  2. Outline
     1. Introduction to Nonnegative Matrix Factorization
        ◮ Motivations and applications
        ◮ Some algorithms
     2. Rank-one update and Nonnegative Factorization
        ◮ Nonnegative Factorization
        ◮ Complexity and the maximum edge biclique problem
     3. Greedy with Underapproximations
        ◮ For sparse approximations
        ◮ Algorithm based on Lagrangian relaxation

  3. Why low-rank matrix approximations? Given a matrix $M \in \mathbb{R}^{m \times n}$ and a factorization rank $r$, we would like to find $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{r \times n}$ such that $M \approx UV$, i.e. $M$ is approximated by a matrix of rank at most $r$. → Dimensionality reduction for noise filtering, compression, interpretation, classification, ...

  5. Matrix approximation and optimization. If we want to minimize the sum of squares of the error, i.e.

     $$\min_{U,V} \; \|M - UV\|_F^2 = \sum_{ij} (M - UV)_{ij}^2,$$

     the matrix factorization problem is an unconstrained optimization problem. This is a well-known problem with nice properties, and it can be solved efficiently. It corresponds to finding the principal components of the data matrix (PCA) and can be solved by truncating the singular value decomposition (SVD).
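The truncated-SVD solution mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `truncated_svd` and the random test matrix are hypothetical and not part of the talk. It relies on the Eckart-Young theorem, which states that keeping the $r$ largest singular values gives a best rank-$r$ approximation in the Frobenius norm.

```python
import numpy as np

def truncated_svd(M, r):
    """Return U (m x r) and V (r x n) such that U @ V is a best rank-r
    approximation of M in the Frobenius norm."""
    W, s, Zt = np.linalg.svd(M, full_matrices=False)
    U = W[:, :r] * s[:r]   # scale the r leading left singular vectors
    V = Zt[:r, :]          # r leading right singular vectors, stored as rows
    return U, V

# Hypothetical data: a random 6 x 5 matrix, approximated with rank r = 2.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 5))
U, V = truncated_svd(M, r=2)

# Squared Frobenius error of the approximation.
err = np.linalg.norm(M - U @ V, "fro") ** 2
# It should equal the sum of the squared discarded singular values.
s = np.linalg.svd(M, compute_uv=False)
```

Note that the factors returned this way generally contain negative entries, even when $M$ itself is nonnegative.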

  7. Matrix factorization, a linear model. If each column of $M$ is an element of a dataset, then

     $$\underbrace{M_{:j}}_{\text{elements of the data}} \approx \sum_{k=1}^{r} \underbrace{U_{:k}}_{\text{basis elements}} \, \underbrace{V_{kj}}_{\text{weights}},$$

     i.e. the columns of $M$ are decomposed into linear combinations of the columns of $U$, which then form a basis for these elements. Example:

     $$M = \begin{pmatrix} 2 & 3 & 2 \\ 2 & 1 & 1 \\ 1 & 5 & 0 \end{pmatrix} \approx \begin{pmatrix} 1.5 & -0.8 \\ 0.7 & -0.9 \\ 1.9 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2.3 & 0.6 \\ -1 & 0.7 & -1 \end{pmatrix} = UV = \begin{pmatrix} 2.3 & 2.9 & 1.7 \\ 1.6 & 1 & 1.3 \\ 0.9 & 5.1 & 0.1 \end{pmatrix}$$
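As a quick check of the example on this slide, the product $UV$ can be recomputed numerically, both as a matrix product and column by column as a linear combination of the columns of $U$. The variable names below are chosen for illustration only; the numbers are the ones from the slide.

```python
import numpy as np

# Matrices from the slide's example.
M = np.array([[2, 3, 2], [2, 1, 1], [1, 5, 0]], dtype=float)
U = np.array([[1.5, -0.8], [0.7, -0.9], [1.9, 1.0]])
V = np.array([[1.0, 2.3, 0.6], [-1.0, 0.7, -1.0]])

# Rank-2 approximation of M.
UV = U @ V

# Column j of UV as a weighted sum of the basis elements U[:, k],
# with weights V[k, j].
j = 1
column_j = sum(U[:, k] * V[k, j] for k in range(U.shape[1]))

print(np.round(UV, 1))
```

Rounded to one decimal, `UV` matches the matrix shown on the slide. Both $U$ and $V$ contain negative entries here, which is what the nonnegativity constraints introduced next are meant to rule out.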

  9. Nonnegativity. In many applications, data are nonnegative, often due to physical considerations, e.g.
     ⋄ images are described by pixel intensities;
     ⋄ texts are represented by vectors of word counts;
     ⋄ spectra correspond to power intensities.
     For interpretation purposes, one can impose nonnegativity constraints on the factor $U$ so that the basis elements belong to the same space as the original data. Moreover, to force the reconstruction of the data from the basis elements to be additive, one can require the weights $V$ to be nonnegative as well.
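To make the constraints $U \ge 0$ and $V \ge 0$ concrete, here is a minimal sketch of one way to compute such a factorization. The transcript does not specify an algorithm at this point, so this uses the classical multiplicative updates of Lee and Seung as a stand-in. The function name `nmf_multiplicative`, the iteration count, the small constant `eps`, and the random initialization are all assumptions made for this illustration.

```python
import numpy as np

def nmf_multiplicative(M, r, n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative M by U @ V with U >= 0 and V >= 0,
    using multiplicative updates for the squared Frobenius error."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    # Nonnegative random initialization (an arbitrary choice).
    U = rng.random((m, r)) + eps
    V = rng.random((r, n)) + eps
    errors = []
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so the factors
        # stay nonnegative; eps guards against division by zero.
        V *= (U.T @ M) / (U.T @ U @ V + eps)
        U *= (M @ V.T) / (U @ V @ V.T + eps)
        errors.append(np.linalg.norm(M - U @ V, "fro") ** 2)
    return U, V, errors

# The nonnegative matrix from the earlier linear-model example.
M = np.array([[2.0, 3.0, 2.0], [2.0, 1.0, 1.0], [1.0, 5.0, 0.0]])
U, V, errors = nmf_multiplicative(M, r=2)
```

Unlike the unconstrained case, the result depends on the initialization, and these updates are only expected to reach a local solution.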

  12. Image Processing. Each column of $M$ represents a face by its pixel intensities → $M$ is a nonnegative matrix.

  13. Image Processing. For an unconstrained decomposition (Figure: gray: positive entries; red: negative entries), the basis elements are not nonnegative and cannot easily be interpreted as facial features.

  14. Image Processing. $U \ge 0$ constrains the basis elements to be nonnegative. Moreover, $V \ge 0$ imposes an additive reconstruction. The basis elements extract facial features such as eyes, nose and lips.

  17. Image Processing. NMF allows a part-based representation of the data.

  18. Text Mining. $M(i, j)$ is the frequency of word $i$ in text $j$, i.e. each column of $M$ contains the word frequencies of one text.
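A word-by-text matrix of this kind can be built as follows. This is a minimal sketch: the three short texts are invented for illustration (reusing words from the topic examples in the talk), and raw counts are used as the frequencies, which is an assumption.

```python
from collections import Counter
import numpy as np

# Hypothetical toy corpus; each string is one text (one column of M).
texts = [
    "profit company bank profit",
    "run jump score run",
    "bank company score",
]

# Row index i of M corresponds to the i-th word of the sorted vocabulary.
vocabulary = sorted({word for text in texts for word in text.split()})
index = {word: i for i, word in enumerate(vocabulary)}

# M[i, j] = number of occurrences of word i in text j.
M = np.zeros((len(vocabulary), len(texts)))
for j, text in enumerate(texts):
    for word, count in Counter(text.split()).items():
        M[index[word], j] = count
```

By construction, every entry of `M` is a nonnegative count, so the matrix is a natural input for NMF.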

  19. Text Mining. ⋄ The basis elements make it possible to recover the different topics; ⋄ the weights make it possible to assign each text to its corresponding class.

  23. Text Mining.
      ⋄ The basis elements make it possible to recover the different topics:
        ◮ Basis element 1: profit, company, bank, ... → Economy
        ◮ Basis element 2: run, jump, score, ... → Sport
      ⋄ The weights make it possible to assign each text to its corresponding class.
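One simple way to turn the weights into class assignments is to give each text the topic with the largest weight in its column of $V$. The sketch below assumes this argmax rule, which the slides do not state explicitly, and the matrix `V` is hypothetical rather than the output of an actual factorization.

```python
import numpy as np

# Topic names matching basis elements 1 and 2 on the slide.
topics = ["Economy", "Sport"]

# Hypothetical nonnegative weights: V[k, j] is the weight of
# basis element k in text j (2 topics, 4 texts).
V = np.array([
    [0.9, 0.1, 0.6, 0.0],
    [0.2, 1.3, 0.5, 0.8],
])

# Assign each text to the topic with the largest weight in its column.
labels = [topics[k] for k in V.argmax(axis=0)]
print(labels)  # ['Economy', 'Sport', 'Economy', 'Sport']
```

The third text has nearly equal weights on both topics, which shows that a hard assignment can hide mixed content; the weights themselves carry more information than the label.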

  28. Spectral Data Analysis. More than 15,000 objects of various types are in orbit (military and commercial satellites, debris, ...). This creates a need for space-object database mining, object identification, clustering, classification, ...
