Two added structures in sparse recovery: nonnegativity and disjointedness
Simon Foucart, University of Georgia
Semester Program on “High-Dimensional Approximation”, ICERM, 7 October 2014

Part I: Nonnegative Sparse Recovery (joint work with D. Koslicki)

Motivation from Metagenomics

◮ x ∈ ℝ^N (N = 273,727): concentrations of known bacteria in a given environmental sample. The sparsity assumption is realistic. Note also that x ≥ 0 and ∑_j x_j = 1.

◮ y ∈ ℝ^m (m = 4^6 = 4,096): frequencies of length-6 subwords (in 16S rRNA gene reads or in whole-genome shotgun reads).

◮ A ∈ ℝ^{m×N}: frequencies of length-6 subwords in all known (i.e., sequenced) bacteria. It is a frequency matrix, that is, A_{i,j} ≥ 0 and ∑_{i=1}^m A_{i,j} = 1.

◮ Quikr improves on traditional read-by-read methods, especially in terms of speed.

◮ Codes available at sourceforge.net/projects/quikr/ and sourceforge.net/projects/wgsquikr/
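To make these objects concrete, here is a minimal sketch (my own illustration, not the Quikr code) of how a length-6 subword frequency vector could be formed from a read. The same construction, applied to each sequenced genome, would produce the columns of A; the function name and the normalization convention are assumptions of this sketch.

```python
from itertools import product
from collections import Counter

def kmer_frequency_vector(seq, k=6):
    # Index the 4**k possible length-k subwords over the DNA alphabet.
    index = {''.join(p): i for i, p in enumerate(product('ACGT', repeat=k))}
    # Count every length-k window of the read.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(n for kmer, n in counts.items() if kmer in index)
    vec = [0.0] * len(index)
    for kmer, n in counts.items():
        if kmer in index:  # skip windows with ambiguous bases (e.g. 'N')
            vec[index[kmer]] = n / total
    return vec  # nonnegative entries summing to 1, as for y and the columns of A

y = kmer_frequency_vector("ACGTACGTTGCAACGT" * 10)  # m = 4096 entries
```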

Exact Measurements

Let x ∈ ℝ^N be a nonnegative vector with support S, and let S̄ denote the complement of S.

◮ x is the unique minimizer of ‖z‖₁ s.to Az = y iff
(BP) for all v ∈ ker A \ {0}, |∑_{j∈S} v_j| < ∑_{ℓ∈S̄} |v_ℓ|.

◮ x is the unique minimizer of ‖z‖₁ s.to Az = y and z ≥ 0 iff
(NNBP) for all v ∈ ker A \ {0}, v_{S̄} ≥ 0 ⇒ ∑_{i=1}^N v_i > 0.

◮ x is the unique z ≥ 0 s.to Az = y iff
(F) for all v ∈ ker A \ {0}, v_{S̄} ≥ 0 is impossible.

In general, (F) ⇒ (NNBP) and (BP) ⇒ (NNBP). If 1 ∈ im(Aᵀ) (e.g., if A is a frequency matrix), then (NNBP) ⇒ (F) ⇒ (BP).

Moral: ℓ₁-minimization is not suited for nonnegative sparse recovery.
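All three are "for all v ∈ ker A \ {0}" conditions, so sampling kernel vectors can refute them but never certify them. A minimal Monte Carlo sketch (my own illustration, with hypothetical helper names, not from the talk; note that random Gaussian samples rarely land in the orthant v_{S̄} ≥ 0, so this is a weak test of (F) and (NNBP)):

```python
import numpy as np

def kernel_basis(A, tol=1e-10):
    # Orthonormal basis of ker A, as columns, via the SVD.
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

def refute_conditions(A, S, trials=10000, seed=0):
    # Sample v in ker A and test (BP), (NNBP), (F); one bad sample
    # disproves a condition, but no number of samples proves it.
    rng = np.random.default_rng(seed)
    V = kernel_basis(A)
    Sbar = np.setdiff1d(np.arange(A.shape[1]), S)
    bp = nnbp = f = True
    for _ in range(trials):
        v = V @ rng.standard_normal(V.shape[1])
        if np.linalg.norm(v) < 1e-12:
            continue                  # conditions exclude v = 0
        if np.abs(v[S].sum()) >= np.abs(v[Sbar]).sum():
            bp = False
        if np.all(v[Sbar] >= 0):
            f = False                 # v_{S̄} ≥ 0 occurred: (F) fails
            if v.sum() <= 0:
                nnbp = False          # ... and the sum is not positive
    return bp, nnbp, f
```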

Nonnegative Least Squares

◮ To solve the feasibility problem, one may consider
minimize_{z ∈ ℝ^N} ‖y − Az‖₂² subject to z ≥ 0.

◮ MATLAB's lsqnonneg implements [Lawson–Hanson 74].

◮ This algorithm iterates the scheme
S^{n+1} = S^n ∪ {j^{n+1}}, where j^{n+1} = argmax_j [A*(y − Ax^n)]_j,
x^{n+1} = argmin {‖y − Az‖₂ : supp(z) ⊆ S^{n+1}},
with an inner loop to make sure that x^{n+1} ≥ 0.

◮ The connection with OMP explains the suitability for sparse recovery.
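In Python, SciPy exposes the same Lawson–Hanson active-set method as scipy.optimize.nnls. A small sketch on a toy frequency matrix (sizes, seed, and support are illustrative, not the metagenomic dimensions):

```python
import numpy as np
from scipy.optimize import nnls  # Lawson–Hanson active-set NNLS

rng = np.random.default_rng(1)
m, N = 20, 50
A = rng.random((m, N))
A /= A.sum(axis=0)                 # columns sum to 1: a frequency matrix
x = np.zeros(N)
x[[3, 17]] = [0.7, 0.3]            # 2-sparse nonnegative ground truth
y = A @ x

x_hat, res = nnls(A, y)            # minimize ||y - Az||_2 subject to z >= 0
print(np.flatnonzero(x_hat > 1e-8), res)   # recovered support, residual
```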

Inaccurate Measurements

◮ When y = Ax + e with e ≠ 0, a classical strategy consists in solving the ℓ₁-regularization
minimize_{z ∈ ℝ^N} ‖z‖₁ + ν ‖y − Az‖₂² subject to z ≥ 0.

◮ We prefer the ℓ₁-squared regularization
minimize_{z ∈ ℝ^N} ‖z‖₁² + λ² ‖y − Az‖₂² subject to z ≥ 0.
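One convenient feature of the squared variant: for z ≥ 0 we have ‖z‖₁ = ∑_j z_j, so the objective equals the plain least-squares residual of an augmented system, and a single NNLS call solves it. A sketch of this elementary reduction (my own illustration under that identity, not necessarily the talk's implementation):

```python
import numpy as np
from scipy.optimize import nnls

def l1_squared_nnls(A, y, lam):
    # For z >= 0, ||z||_1 = sum(z), hence
    #   ||z||_1^2 + lam^2 ||y - Az||_2^2 = ||A_aug z - y_aug||_2^2
    # with A_aug = [1^T; lam*A] and y_aug = [0; lam*y].
    m, N = A.shape
    A_aug = np.vstack([np.ones((1, N)), lam * A])
    y_aug = np.concatenate([[0.0], lam * y])
    z, _ = nnls(A_aug, y_aug)
    return z
```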
