Two added structures in sparse recovery: nonnegativity and disjointedness
Simon Foucart, University of Georgia
Semester Program on "High-Dimensional Approximation", ICERM, 7 October 2014
Part I: Nonnegative Sparse Recovery (joint work with D. Koslicki)
Motivation from Metagenomics
◮ x ∈ R^N (N = 273,727): concentrations of known bacteria in a given environmental sample. The sparsity assumption is realistic. Note also that x ≥ 0 and ∑_j x_j = 1.
◮ y ∈ R^m (m = 4^6 = 4,096): frequencies of length-6 subwords (in 16S rRNA gene reads or in whole-genome shotgun reads).
◮ A ∈ R^{m×N}: frequencies of length-6 subwords in all known (i.e., sequenced) bacteria. It is a frequency matrix, that is, A_{i,j} ≥ 0 and ∑_{i=1}^m A_{i,j} = 1.
◮ Quikr improves on traditional read-by-read methods, especially in terms of speed.
◮ Codes available at sourceforge.net/projects/quikr/ and sourceforge.net/projects/wgsquikr/
Exact Measurements
Let x ∈ R^N be a nonnegative vector with support S.
◮ x is the unique minimizer of ‖z‖_1 s.to Az = y iff
(BP) for all v ∈ ker A \ {0}, |∑_{j∈S} v_j| < ∑_{ℓ∈S̄} |v_ℓ|.
◮ x is the unique minimizer of ‖z‖_1 s.to Az = y and z ≥ 0 iff
(NNBP) for all v ∈ ker A \ {0}, v_{S̄} ≥ 0 ⇒ ∑_{i=1}^N v_i > 0.
◮ x is the unique z ≥ 0 s.to Az = y iff
(F) for all v ∈ ker A \ {0}, v_{S̄} ≥ 0 is impossible.
In general, (F) ⇒ (NNBP) and (BP) ⇒ (NNBP). If 1 ∈ im(A^⊤) (e.g., if A is a frequency matrix), then (NNBP) ⇒ (F) ⇒ (BP).
Morale: ℓ1-minimization is not suited for nonnegative sparse recovery.
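Why the condition 1 ∈ im(A^⊤) trivializes ℓ1-minimization can be seen in one line (a short derivation, not on the original slide): if 1_N = A^⊤u for some u ∈ R^m (for a frequency matrix, u = 1_m works), then the ℓ1 norm is constant on the feasible set.

```latex
% For any feasible z >= 0 with Az = y, assuming 1_N = A^T u:
\[
  \|z\|_1 \;=\; \sum_{j=1}^N z_j \;=\; \mathbf{1}_N^\top z
  \;=\; (A^\top u)^\top z \;=\; u^\top (Az) \;=\; u^\top y,
\]
% a constant depending only on y. The l1 objective therefore cannot
% distinguish between feasible points, and l1-minimization under the
% constraints Az = y, z >= 0 reduces to the feasibility problem (F).
```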
Nonnegative Least Squares
◮ To solve the feasibility problem, one may consider
minimize over z ∈ R^N: ‖y − Az‖_2^2 subject to z ≥ 0.
◮ MATLAB's lsqnonneg implements [Lawson–Hanson 74].
◮ This algorithm iterates the scheme
S^{n+1} = S^n ∪ {j^{n+1}}, where j^{n+1} = argmax_j [A^*(y − Ax^n)]_j,
x^{n+1} = argmin {‖y − Az‖_2 : supp(z) ⊆ S^{n+1}},
with an inner loop to make sure that x^{n+1} ≥ 0.
◮ The connection with OMP explains its suitability for sparse recovery.
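As a quick numerical illustration (not part of the original slides), SciPy's `nnls`, which implements the same Lawson–Hanson active-set algorithm as MATLAB's `lsqnonneg`, can be run on a small synthetic frequency matrix; the dimensions and random model below are illustrative placeholders, not the metagenomics data.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
m, N, s = 60, 120, 5          # illustrative sizes, far smaller than Quikr's

# Random frequency matrix: nonnegative entries, each column summing to 1.
A = rng.random((m, N))
A /= A.sum(axis=0)

# Sparse nonnegative ground truth with total concentration 1.
x = np.zeros(N)
support = rng.choice(N, size=s, replace=False)
x[support] = rng.random(s)
x /= x.sum()

y = A @ x  # exact measurements

# Nonnegative least squares via Lawson-Hanson (as in lsqnonneg).
x_hat, rnorm = nnls(A, y)

print("residual norm:", rnorm)          # essentially 0: y is feasible
print("l1 norm of x_hat:", x_hat.sum()) # ~ sum(y) = 1, as A is a frequency matrix
```

Note that since A is a frequency matrix, any nonnegative solution with zero residual automatically has ℓ1 norm equal to ∑_i y_i = 1, consistently with the constancy of the ℓ1 norm on the feasible set.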
Inaccurate Measurements
◮ When y = Ax + e with e ≠ 0, a classical strategy consists in solving the ℓ1-regularization
minimize over z ∈ R^N: ‖z‖_1 + ν ‖y − Az‖_2^2 subject to z ≥ 0.
◮ We prefer the ℓ1-squared regularization
minimize over z ∈ R^N: ‖z‖_1^2 + λ^2 ‖y − Az‖_2^2 subject to z ≥ 0.
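One appeal of the squared formulation: for z ≥ 0 we have ‖z‖_1 = 1^⊤z, so the whole objective is a single sum of squares, ‖z‖_1^2 + λ^2‖y − Az‖_2^2 = ‖Ã z − b̃‖_2^2 with Ã obtained by stacking a row of ones on top of λA and b̃ = (0, λy). It can therefore be handed directly to a nonnegative least squares solver. A minimal sketch with synthetic data and an illustrative λ (both placeholders, not values from the talk):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
m, N, s, lam = 60, 120, 5, 10.0   # illustrative sizes and lambda

A = rng.random((m, N))
A /= A.sum(axis=0)                # frequency matrix: columns sum to 1
x = np.zeros(N)
x[rng.choice(N, size=s, replace=False)] = rng.random(s)
x /= x.sum()
y = A @ x + 0.01 * rng.standard_normal(m)   # noisy measurements

# For z >= 0, ||z||_1 = 1^T z, hence
#   ||z||_1^2 + lam^2 ||y - A z||_2^2 = ||A_aug z - b_aug||_2^2
# with one extra row of ones stacked on lam * A.
A_aug = np.vstack([np.ones((1, N)), lam * A])
b_aug = np.concatenate([[0.0], lam * y])

z_hat, _ = nnls(A_aug, b_aug)     # global minimizer of the convex objective

def objective(z):
    return z.sum() ** 2 + lam ** 2 * np.sum((y - A @ z) ** 2)

print("objective at z_hat:", objective(z_hat))
print("objective at x:   ", objective(x))   # z_hat can only do better
```

Since the problem is convex and `nnls` finds its global minimum, the computed objective value is at most the value at the ground truth x.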