Structured random measurements in compressed sensing
Holger Rauhut, Lehrstuhl C für Mathematik (Analysis), RWTH Aachen
Winter School on Compressed Sensing, TU Berlin, December 3–5, 2014
Too Few Data
Often it is hard, expensive, time-consuming or otherwise difficult to collect enough measurements in order to reconstruct/acquire a signal.
Examples: Magnetic Resonance Imaging, Wireless communication, Radar.
Compressive sensing
Reconstruction of signals from a minimal amount of measured data.
Key ingredients
◮ Compressibility / sparsity (small complexity of relevant information)
◮ Efficient algorithms (convex optimization)
◮ Randomness (random matrices)
Here: structured random matrices
Sparsity / Compressibility
Data Compression
Most types of signals can be represented well by a sparse expansion, i.e., with only a few nonzero coefficients in an appropriate basis (JPEG, MPEG, MP3, etc.).
Compressive Sensing / Sparse Recovery
Sparse / compressible signals can be recovered from only a few linear measurements via efficient algorithms.
[Figure: a time-domain signal with 30 samples and its Fourier coefficients; traditional reconstruction vs. sparse recovery method]
Mathematical formulation
Recover a vector x ∈ C^N from underdetermined linear measurements
y = A x,  A ∈ C^{m×N},  where m ≪ N.
Key finding of compressive sensing: recovery is possible if x belongs to a set of low complexity.
◮ Standard compressive sensing: sparsity (small number of nonzero coefficients)
◮ Low rank matrix recovery
◮ Phase retrieval
◮ Low rank tensor recovery (only partial results available so far)
Sparsity and Compressibility
◮ coefficient vector: x ∈ C^N, N ∈ N
◮ support of x: supp x := { j : x_j ≠ 0 }
◮ s-sparse vectors: ‖x‖_0 := |supp x| ≤ s
s-term approximation error:
σ_s(x)_q := inf { ‖x − z‖_q : z is s-sparse },  0 < q ≤ ∞.
x is called compressible if σ_s(x)_q decays quickly in s. Recall ‖x‖_q = (∑_{j=1}^N |x_j|^q)^{1/q}.
Stechkin estimate: for p < q,
σ_s(x)_q ≤ s^{1/q − 1/p} ‖x‖_p.
The unit balls B_p^N = { x ∈ C^N : ‖x‖_p ≤ 1 }, 0 < p ≤ 1, are good models for compressible signals.
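Not from the slides, but as a quick numerical illustration: the best s-term approximation is obtained by keeping the s largest-magnitude entries, and the Stechkin estimate can be checked directly. A minimal sketch in Python (parameter choices are arbitrary):

```python
import numpy as np

def best_s_term_error(x, s, q):
    """sigma_s(x)_q: l_q error of the best s-term approximation,
    obtained by keeping the s largest-magnitude entries of x."""
    idx = np.argsort(np.abs(x))[::-1]   # indices sorted by decreasing magnitude
    tail = x[idx[s:]]                   # entries discarded by the approximation
    return np.sum(np.abs(tail) ** q) ** (1.0 / q)

# Example: a compressible vector with power-law decaying coefficients
N, s, p, q = 200, 10, 0.5, 2.0
x = np.sign(np.random.randn(N)) * np.arange(1, N + 1) ** (-2.0)

sigma = best_s_term_error(x, s, q)
stechkin = s ** (1.0 / q - 1.0 / p) * np.sum(np.abs(x) ** p) ** (1.0 / p)
print(f"sigma_s(x)_q = {sigma:.4e} <= Stechkin bound {stechkin:.4e}")
```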
Compressive Sensing Problem
Reconstruct an s-sparse vector x ∈ C^N (or a compressible vector) from its vector of m measurements
y = A x,  A ∈ C^{m×N}.
Interesting case: s < m ≪ N. Preferably with a fast reconstruction algorithm!
ℓ_1-minimization
ℓ_0-minimization is NP-hard:
min_{x ∈ C^N} ‖x‖_0 subject to A x = y.
ℓ_1-minimization:
min_x ‖x‖_1 subject to A x = y
is the convex relaxation of the ℓ_0-minimization problem. Efficient minimization methods are available.
Alternatives: greedy algorithms (matching pursuits), iterative hard thresholding, iteratively reweighted least squares.
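The slides do not fix a particular solver; as one illustrative possibility (real-valued case only, not the method of the slides), equality-constrained ℓ_1-minimization (basis pursuit) can be recast as a linear program by splitting x = u − v with u, v ≥ 0 and solved with scipy.optimize.linprog. A minimal sketch:

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||x||_1 subject to A x = y (real case) as a linear program:
    write x = u - v with u, v >= 0 and minimize sum(u) + sum(v)."""
    m, N = A.shape
    c = np.ones(2 * N)                    # objective: sum of u plus sum of v
    A_eq = np.hstack([A, -A])             # constraint: A u - A v = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * N))
    if not res.success:
        raise RuntimeError(res.message)
    u, v = res.x[:N], res.x[N:]
    return u - v
```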
A typical result in compressive sensing
For a draw of a Gaussian random matrix A ∈ R^{m×N}, an s-sparse vector x ∈ R^N can be recovered via ℓ_1-minimization (and other algorithms) with high probability from y = Ax provided
m ≥ C s ln(eN/s).
Similar results hold for certain structured random matrices (random partial Fourier, partial random circulant, ...).
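A toy experiment in the spirit of this statement (not from the slides; the constant 4 and all sizes are arbitrary, and it reuses the basis_pursuit sketch from the previous slide):

```python
import numpy as np

# Draw a Gaussian matrix and a sparse signal, then recover via l1-minimization.
rng = np.random.default_rng(0)
N, s = 128, 5
m = int(np.ceil(4 * s * np.log(np.e * N / s)))   # m ~ C s ln(eN/s) with C = 4

A = rng.standard_normal((m, N)) / np.sqrt(m)
x = np.zeros(N)
support = rng.choice(N, size=s, replace=False)
x[support] = rng.standard_normal(s)

x_hat = basis_pursuit(A, A @ x)                  # sketch defined above
print("recovery error:", np.linalg.norm(x - x_hat))
```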
Recovery conditions
Uniform vs. nonuniform recovery
Often recovery results are for random matrices A ∈ R^{m×N}.
◮ Uniform recovery: with high probability on A, every sparse vector (low rank matrix etc.) is recovered;
P(∀ s-sparse x: recovery of x is successful using A) ≥ 1 − ε.
Recovery conditions on A: null space property, restricted isometry property.
◮ Nonuniform recovery: a fixed sparse vector (low rank matrix etc.) is recovered with high probability using A ∈ R^{m×N};
∀ s-sparse x: P(recovery of x is successful using A) ≥ 1 − ε.
Recovery conditions on A: the tangent cone (descent cone) of the norm at x intersects ker A trivially; dual certificates.
Recovery via tangent cones
For x ∈ R^N, consider the optimization problem
min ‖z‖ subject to Az = Ax.
Tangent cone (descent cone):
T(x) = cone{ z − x : z ∈ R^N, ‖z‖ ≤ ‖x‖ }.
Theorem: x is recovered if ker A ∩ T(x) = {0}.
Recovery from Gaussian random matrices
Entries of A ∈ R^{m×N}: independent standard Gaussian random variables.
Recovery guarantees via the Gaussian width: for a set T ⊂ R^n and a standard Gaussian vector g ∈ R^n,
ℓ(T) = E sup_{x ∈ T} ⟨x, g⟩.
Theorem (Chandrasekaran, Recht, Parrilo, Willsky 2010): A vector x ∈ R^N is recovered from y = Ax via min ‖z‖ s.t. Az = y with high probability if
m ≳ ℓ(T(x) ∩ S^{N−1})².
Based on Gordon's escape through a mesh theorem (Gordon's comparison theorem and concentration of measure for Lipschitz functions).
For s-sparse vector recovery via ℓ_1-minimization: m ≳ 2 s ln(eN/s).
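Not part of the slides: the squared width on the right-hand side can be estimated numerically via the closely related statistical dimension of the descent cone, δ(T(x)) = E min_{τ≥0} dist(g, τ·∂‖x‖_1)², which differs from ℓ(T(x) ∩ S^{N−1})² by at most 1. A rough Monte Carlo sketch (function names and parameters are my own):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def statdim_l1_descent_cone(x, trials=500, rng=None):
    """Monte Carlo estimate of the statistical dimension of the l1 descent cone
    T(x): delta(T) = E[ min_{tau>=0} dist(g, tau * subdiff ||x||_1)^2 ]."""
    rng = np.random.default_rng() if rng is None else rng
    N = x.size
    S = x != 0
    sgn = np.sign(x[S])
    total = 0.0
    for _ in range(trials):
        g = rng.standard_normal(N)
        def dist2(tau):
            on = np.sum((g[S] - tau * sgn) ** 2)                      # support entries
            off = np.sum(np.maximum(np.abs(g[~S]) - tau, 0.0) ** 2)   # off-support entries
            return on + off
        total += minimize_scalar(dist2, bounds=(0.0, 10 * np.sqrt(np.log(N))),
                                 method="bounded").fun
    return total / trials

# Example: the estimate should be comparable to 2 s ln(eN/s)
N, s = 200, 8
x = np.zeros(N); x[:s] = 1.0
print(statdim_l1_descent_cone(x), 2 * s * np.log(np.e * N / s))
```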
Dual certificate
Theorem (Fuchs 2004, Tropp 2005): Let A ∈ C^{m×N}. A vector x ∈ C^N with support S is the unique solution of min ‖z‖_1 subject to Az = Ax if one of the following equivalent conditions holds:
(a) |∑_{j∈S} sgn(x_j) v_j| < ‖v_{S^c}‖_1 for all v ∈ ker A \ {0},
(b) A_S is injective and there exists a dual vector h ∈ C^m such that
(A^* h)_j = sgn(x_j) for j ∈ S,  |(A^* h)_ℓ| < 1 for ℓ ∈ S^c.
Dual certificate (II)
Corollary: Let a_1, ..., a_N be the columns of A ∈ C^{m×N}. For x ∈ C^N with support S, if the matrix A_S is injective and if
|⟨A_S^† a_ℓ, sgn(x_S)⟩| < 1 for all ℓ ∈ S^c,
then the vector x is the unique ℓ_1-minimizer with y = Ax. Here, A_S^† is the Moore–Penrose pseudo-inverse of A_S.
One ingredient: check that A_S is well-conditioned, i.e.,
‖A_S^* A_S − I‖_{2→2} ≤ δ < 1.
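This condition is straightforward to test numerically for a given matrix and support. A minimal sketch (my own helper, not from the slides):

```python
import numpy as np

def fuchs_condition(A, x, tol=1e-12):
    """Check the sufficient condition of the corollary: A_S injective and
    |<A_S^dagger a_l, sgn(x_S)>| < 1 for every column a_l with l outside S."""
    S = np.flatnonzero(x)
    A_S = A[:, S]
    if np.linalg.matrix_rank(A_S) < S.size:      # injectivity of A_S
        return False
    sgn = x[S] / np.abs(x[S])                    # sign vector; works for real and complex entries
    off = np.setdiff1d(np.arange(A.shape[1]), S)
    corr = sgn.conj() @ (np.linalg.pinv(A_S) @ A[:, off])
    return bool(np.all(np.abs(corr) < 1 - tol))
```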
Null space property
The null space property is necessary and sufficient for exact recovery of all s-sparse vectors via ℓ_1-minimization with A:
‖v_S‖_1 ≤ ρ ‖v_{S^c}‖_1 for all v ∈ ker A and all S ⊂ {1, ..., N} with |S| = s,
for some 0 < ρ < 1. It also implies stability of reconstruction:
‖x − x^♯‖_1 ≤ (2(1 + ρ)/(1 − ρ)) σ_s(x)_1.
Restricted Isometry Property (RIP)
Definition: The restricted isometry constant δ_s of a matrix A ∈ C^{m×N} is defined as the smallest δ_s such that
(1 − δ_s) ‖x‖_2² ≤ ‖Ax‖_2² ≤ (1 + δ_s) ‖x‖_2² for all s-sparse x ∈ C^N.
It requires that all s-column submatrices of A are well-conditioned.
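Computing δ_s exactly is a combinatorial problem; a quick heuristic check (not from the slides, and only a lower estimate) samples random supports and looks at the extreme eigenvalues of the Gram matrices A_S^* A_S:

```python
import numpy as np

def estimate_delta_s(A, s, trials=1000, rng=None):
    """Monte Carlo LOWER estimate of the restricted isometry constant delta_s:
    delta_s >= max over sampled supports S of ||A_S^* A_S - I||_{2->2}."""
    rng = np.random.default_rng() if rng is None else rng
    m, N = A.shape
    worst = 0.0
    for _ in range(trials):
        S = rng.choice(N, size=s, replace=False)
        G = A[:, S].conj().T @ A[:, S]       # s x s Gram matrix of the submatrix
        eigs = np.linalg.eigvalsh(G)         # real eigenvalues (G is Hermitian)
        worst = max(worst, np.max(np.abs(eigs - 1.0)))
    return worst

# Example: normalized Gaussian matrix
rng = np.random.default_rng(1)
m, N, s = 100, 400, 8
A = rng.standard_normal((m, N)) / np.sqrt(m)
print("estimated lower bound on delta_s:", estimate_delta_s(A, s, rng=rng))
```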
RIP implies recovery by ℓ_1-minimization
Theorem (Candès, Romberg, Tao ’04 – Candès ’08 – Foucart, Lai ’09 – Foucart ’09/’12 – Li, Mo ’11 – Andersson, Strömberg ’12 – Cai, Zhang ’13): Assume that the restricted isometry constant of A ∈ C^{m×N} satisfies
δ_{2s} < 1/√2 ≈ 0.7071.
Then ℓ_1-minimization reconstructs every s-sparse vector x ∈ C^N from y = Ax.
This extends to low rank matrix recovery via nuclear norm minimization.
Stability
Theorem (Candès, Romberg, Tao ’04 – Candès ’08 – Foucart, Lai ’09 – Foucart ’09/’12 – Li, Mo ’11 – Andersson, Strömberg ’12 – Cai, Zhang ’13): Let A ∈ C^{m×N} with δ_{2s} < 1/√2 ≈ 0.7071. Let x ∈ C^N, and assume that noisy data are observed, y = Ax + η with ‖η‖_2 ≤ σ. Let x^# be a solution of
min_z ‖z‖_1 such that ‖Az − y‖_2 ≤ σ.
Then
‖x − x^#‖_2 ≤ C σ_s(x)_1 / √s + D σ  and  ‖x − x^#‖_1 ≤ C σ_s(x)_1 + D √s σ
for constants C, D > 0 that depend only on δ_{2s}.
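The constrained program above is a second-order cone problem rather than a plain LP; one illustrative way to solve it in the real-valued case, assuming the third-party cvxpy package is available (not mentioned on the slides), is:

```python
import cvxpy as cp
import numpy as np

def bp_denoise(A, y, sigma):
    """Quadratically constrained basis pursuit:
    min ||z||_1  subject to  ||A z - y||_2 <= sigma."""
    z = cp.Variable(A.shape[1])
    problem = cp.Problem(cp.Minimize(cp.norm1(z)),
                         [cp.norm2(A @ z - y) <= sigma])
    problem.solve()
    return z.value
```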
Matrices satisfying the RIP
Open problem: give explicit matrices A ∈ C^{m×N} with small δ_{2s} ≤ 0.7 for “large” s.
Goal: δ_s ≤ δ if m ≥ C_δ s ln^α(N), for constants C_δ and α.
Deterministic matrices are known for which m ≥ C_{δ,k} s² suffices, provided N ≤ m^k.
Way out: consider random matrices.
Random matrices
RIP for Gaussian and Bernoulli matrices
Gaussian: entries of A are independent N(0, 1) random variables.
Bernoulli: entries of A are independent Bernoulli ±1 distributed random variables.
Theorem: Let A ∈ R^{m×N} be a Gaussian or Bernoulli random matrix and assume
m ≥ C δ^{−2} (s ln(eN/s) + ln(2ε^{−1}))
for a universal constant C > 0. Then with probability at least 1 − ε the restricted isometry constant of (1/√m) A satisfies δ_s ≤ δ.
Consequence: recovery via ℓ_1-minimization with probability exceeding 1 − e^{−cm} provided m ≥ C s ln(eN/s).
The bound is optimal, as follows from lower bounds for the Gelfand widths of ℓ_p-balls, 0 < p ≤ 1 (Gluskin, Garnaev 1984 – Foucart, Pajor, Rauhut, Ullrich 2010).
Structured Random Measurements
Structured Random Measurements
Why structure?
◮ Applications impose structure due to physical constraints; there is limited freedom to inject randomness.
◮ Fast matrix-vector multiplies (FFT) in recovery algorithms; unstructured random matrices are impractical for large-scale applications.
◮ Storage problems for unstructured matrices.
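To make the fast matrix-vector multiply point concrete, here is a small sketch (not from the slides; class and parameter names are my own) of a randomly subsampled discrete Fourier measurement map implemented with the FFT, i.e., without ever forming the m × N matrix:

```python
import numpy as np

class SubsampledFourier:
    """y = (F x)[Omega]: random partial Fourier measurements applied via the FFT
    in O(N log N) time and O(m) extra storage, instead of an explicit m x N matrix."""
    def __init__(self, N, m, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        self.N = N
        self.omega = np.sort(rng.choice(N, size=m, replace=False))  # sampled frequencies

    def forward(self, x):
        return np.fft.fft(x, norm="ortho")[self.omega]

    def adjoint(self, y):
        z = np.zeros(self.N, dtype=complex)
        z[self.omega] = y
        # unitary FFT, so the adjoint is the inverse FFT of the zero-padded samples
        return np.fft.ifft(z, norm="ortho")

# Example: apply the operator to a 1-sparse vector
op = SubsampledFourier(N=1024, m=128)
x = np.zeros(1024); x[37] = 1.0
y = op.forward(x)
```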
Random Sampling