Robust Spectral Compressed Sensing via Structured Matrix Completion
Yuxin Chen, Electrical Engineering, Stanford University
Joint work with Yuejie Chi
July 2013
Sparse Fourier Representation/Approximation
• Fourier representation of a signal:
$$x(t) = \sum_{i=1}^{r} d_i \, e^{j 2\pi \langle t, f_i \rangle}$$
($f_i$: frequencies, $d_i$: amplitudes, $r$: model order)
• Sparsity: nature is (approximately) sparse (small $r$)
• Goal: identify the underlying frequencies from time-domain samples
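As a concrete illustration, here is a minimal numerical sketch of this signal model in the 1-D case; all parameter values below are hypothetical, chosen only for the example:

```python
import numpy as np

# Sample x(t) = sum_{i=1}^r d_i exp(j*2*pi*f_i*t) at t = 0, ..., n-1 (1-D case).
n, r = 64, 3                                                # hypothetical sizes
rng = np.random.default_rng(0)
f = rng.uniform(0, 1, r)                                    # frequencies f_i, off-grid in general
d = rng.standard_normal(r) + 1j * rng.standard_normal(r)   # amplitudes d_i

t = np.arange(n)
x = np.exp(2j * np.pi * np.outer(t, f)) @ d                 # length-n vector of time samples
```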
Applications in Sensing
• Multipath channels: a (relatively) small number of strong paths.
• Radar target identification: a (relatively) small number of strong scatterers.
Applications in Imaging
• Consider a time-sparse signal (a dual problem):
$$z(t) = \sum_{i=1}^{r} d_i \, \delta(t - t_i)$$
• Resolution is limited by the point spread function of the imaging system:
$$\text{image} = z(t) * \mathrm{PSF}(t)$$
Data Model
• Signal model: a mixture of $K$-dimensional sinusoids at $r$ distinct frequencies
$$x(t) = \sum_{i=1}^{r} d_i \, e^{j 2\pi \langle t, f_i \rangle}$$
where $f_i \in [0,1]^K$: frequencies; $d_i$: amplitudes.
• Observed data: $X = [x_{i_1, \ldots, i_K}] \in \mathbb{C}^{n_1 \times \cdots \times n_K}$
  ◦ Continuous dictionary: $f_i$ can assume ANY value in the unit disk
  ◦ Multi-dimensional model: $f_i$ can be multi-dimensional
  ◦ Low-rate data acquisition: obtain partial samples of $X$
• Goal: identify the frequencies from partial measurements
Prior Art
• Parametric estimation (shift invariance of harmonic structures):
  ◦ Prony's method, ESPRIT [RoyKailath'1989], Matrix Pencil [Hua'1992], finite rate of innovation [DragottiVetterliBlu'2007][GedalyahuTurEldar'2011], ...
  ◦ perfect recovery from equi-spaced $O(r)$ samples
  ◦ sensitive to noise and outliers
  ◦ requires prior knowledge of the model order
• Compressed sensing:
  ◦ Discretize the frequencies and assume a sparse representation:
$$f_i \in \mathcal{F} = \left\{ \frac{0}{n_1}, \ldots, \frac{n_1 - 1}{n_1} \right\} \times \left\{ \frac{0}{n_2}, \ldots, \frac{n_2 - 1}{n_2} \right\} \times \cdots$$
  ◦ perfect recovery from $O(r \log n)$ random samples
  ◦ non-parametric approach
  ◦ robust against noise and outliers
  ◦ sensitive to gridding error
Basis Mismatch / Gridding Error
• A toy example: $x(t) = e^{j 2\pi f_0 t}$:
  ◦ The success of CS relies on sparsity in the DFT basis.
  ◦ Basis mismatch: discrete vs. continuous dictionary
    ∗ Mismatch ⇒ kernel leakage ⇒ failure of CS (basis pursuit)
  [Figure: recovered signals under increasing mismatch — $\Delta\theta = 0.1\pi/N$ (normalized recovery error 0.0816), $\Delta\theta = 0.5\pi/N$ (error 0.3461), $\Delta\theta = \pi/N$ (error 1.0873)]
  ◦ Finer gridding does not help [ChiScharfPezeshkiCalderbank'2011]
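To see the leakage concretely, the following sketch (with assumed, hypothetical sizes) compares the DFT of an on-grid tone with one placed half a bin off the grid:

```python
import numpy as np

# An on-grid tone has a 1-sparse DFT; a tone offset by half a bin leaks
# across all bins (the Dirichlet kernel), so it is no longer sparse.
n = 64
t = np.arange(n)
on_grid  = np.exp(2j * np.pi * (8 / n) * t)          # f0 exactly on the DFT grid
off_grid = np.exp(2j * np.pi * ((8 + 0.5) / n) * t)  # f0 halfway between two bins

X_on, X_off = np.fft.fft(on_grid), np.fft.fft(off_grid)
print(np.sum(np.abs(X_on)  > 1e-8))   # 1  -> exactly sparse
print(np.sum(np.abs(X_off) > 1e-8))   # 64 -> kernel leakage: not sparse at all
```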
Two Recent Landmarks in Off-the-Grid Harmonic Retrieval (1-D)
• Super-resolution [CandesFernandezGranda'2012]
  ◦ Low-pass measurements
  ◦ Total-variation norm minimization
• Compressed sensing off the grid [TangBhaskarShahRecht'2012]
  ◦ Random measurements
  ◦ Atomic norm minimization
  ◦ Requires only $O(r \log r \log n)$ samples
• QUESTIONS:
  ◦ How to deal with multi-dimensional frequencies?
  ◦ Robustness against outliers?
Our Objective
[Figure: clean signal, recovered signal, and recovered corruptions (real part vs. data index), together with the recovered frequency amplitudes]
• Goal: seek an algorithm with the following properties
  ◦ non-parametric
  ◦ works for multi-dimensional frequency models
  ◦ works for off-the-grid frequencies
  ◦ requires a minimal number of measurements
  ◦ robust against noise and sparse outliers
Concrete Example: 2-D Frequency Model
Recall that $x(t) = \sum_{i=1}^{r} d_i e^{j 2\pi \langle t, f_i \rangle}$.
• For 2-D frequencies, we have the Vandermonde decomposition
$$X = Y \cdot \underbrace{D}_{\text{diagonal matrix}} \cdot \, Z^T.$$
Here, $D := \mathrm{diag}\{d_1, \cdots, d_r\}$ and
$$Y := \underbrace{\begin{bmatrix} 1 & 1 & \cdots & 1 \\ y_1 & y_2 & \cdots & y_r \\ \vdots & \vdots & \ddots & \vdots \\ y_1^{n_1-1} & y_2^{n_1-1} & \cdots & y_r^{n_1-1} \end{bmatrix}}_{\text{Vandermonde matrix}}, \qquad Z := \underbrace{\begin{bmatrix} 1 & 1 & \cdots & 1 \\ z_1 & z_2 & \cdots & z_r \\ \vdots & \vdots & \ddots & \vdots \\ z_1^{n_2-1} & z_2^{n_2-1} & \cdots & z_r^{n_2-1} \end{bmatrix}}_{\text{Vandermonde matrix}}$$
where $y_i = \exp(j 2\pi f_{1i})$, $z_i = \exp(j 2\pi f_{2i})$.
◦ Spectral sparsity ⇒ $X$ may be low-rank for very small $r$ (see the numerical check below)
◦ Reduced-rate sampling ⇒ observe partial entries of $X$
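A quick numerical sanity check of this decomposition (dimensions are hypothetical): a data matrix built this way has rank $r$, well below $\min(n_1, n_2)$ when $r$ is small.

```python
import numpy as np

# Build X = Y @ D @ Z.T for r = 3 randomly drawn 2-D frequencies.
n1, n2, r = 16, 16, 3
rng = np.random.default_rng(1)
f1, f2 = rng.uniform(0, 1, r), rng.uniform(0, 1, r)
d = rng.standard_normal(r) + 1j * rng.standard_normal(r)

Y = np.exp(2j * np.pi * np.outer(np.arange(n1), f1))  # n1 x r Vandermonde
Z = np.exp(2j * np.pi * np.outer(np.arange(n2), f2))  # n2 x r Vandermonde
X = Y @ np.diag(d) @ Z.T                               # n1 x n2 data matrix

print(np.linalg.matrix_rank(X))  # 3 = r: X is low-rank for small r
```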
Matrix Completion?
Recall that $X = Y \cdot D \cdot Z^T$, with $Y$, $Z$ Vandermonde and $D := \mathrm{diag}\{d_1, \cdots, d_r\}$ diagonal, as on the previous slide.
[Figure: a partially observed data matrix]
• Question: can we apply matrix completion algorithms directly to $X$?
• Yes, but this yields sub-optimal performance:
  ◦ requires at least $r \max\{n_1, n_2\}$ samples
  ◦ $X$ is no longer low-rank if $r > \min(n_1, n_2)$
    ∗ note that $r$ can be as large as $n_1 n_2$
• This calls for more effective forms.
Rethink Matrix Pencil: Matrix Enhancement
• An enhanced form $X_e$: a $k_1 \times (n_1 - k_1 + 1)$ block Hankel matrix [Hua'1992]
$$X_e = \begin{bmatrix} X_0 & X_1 & \cdots & X_{n_1 - k_1} \\ X_1 & X_2 & \cdots & X_{n_1 - k_1 + 1} \\ \vdots & \vdots & \ddots & \vdots \\ X_{k_1 - 1} & X_{k_1} & \cdots & X_{n_1 - 1} \end{bmatrix},$$
where each block is a $k_2 \times (n_2 - k_2 + 1)$ Hankel matrix:
$$X_l = \begin{bmatrix} x_{l,0} & x_{l,1} & \cdots & x_{l,n_2 - k_2} \\ x_{l,1} & x_{l,2} & \cdots & x_{l,n_2 - k_2 + 1} \\ \vdots & \vdots & \ddots & \vdots \\ x_{l,k_2 - 1} & x_{l,k_2} & \cdots & x_{l,n_2 - 1} \end{bmatrix}.$$
[Figure: structure of the enhanced matrix $X_e$]
• Incentive:
  ◦ lift the matrix to promote harmonic structure
  ◦ convert sparsity to low rank
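A sketch of this two-fold Hankel lifting in code; the helper name `enhance` and the pencil parameters are my own choices, not from the talk:

```python
import numpy as np

def enhance(X, k1, k2):
    """Two-fold Hankel lifting of an n1 x n2 data matrix X, following the
    block-Hankel construction above (a sketch; k1, k2 are pencil sizes)."""
    n1, n2 = X.shape
    # Inner k2 x (n2 - k2 + 1) Hankel matrix X_l built from row l of X
    def hankel_row(l):
        return np.array([[X[l, p + q] for q in range(n2 - k2 + 1)]
                         for p in range(k2)])
    blocks = [hankel_row(l) for l in range(n1)]
    # Outer block-Hankel structure: block (i, j) is X_{i+j}
    return np.block([[blocks[i + j] for j in range(n1 - k1 + 1)]
                     for i in range(k1)])
```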
Low-Rank Structure of the Enhanced Matrix
• The enhanced matrix can be decomposed as
$$X_e = \begin{bmatrix} Z_L \\ Z_L Y_d \\ \vdots \\ Z_L Y_d^{k_1 - 1} \end{bmatrix} D \begin{bmatrix} Z_R, & Y_d Z_R, & \cdots, & Y_d^{n_1 - k_1} Z_R \end{bmatrix},$$
where
  ◦ $Z_L$ and $Z_R$ are Vandermonde matrices specified by $z_1, \ldots, z_r$,
  ◦ $Y_d = \mathrm{diag}[y_1, y_2, \cdots, y_r]$.
• The enhanced form $X_e$ is low-rank:
  ◦ $\mathrm{rank}(X_e) \le r$
  ◦ spectral sparsity ⇒ low rank
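Continuing the sketch: reusing the `enhance()` helper from the previous slide on the 2-D example confirms that the lifted matrix keeps rank at most $r$ (sizes again hypothetical):

```python
import numpy as np

# Rebuild the 2-D example (n1 = n2 = 16, r = 3) and lift it with enhance().
n1, n2, r = 16, 16, 3
rng = np.random.default_rng(1)
Y = np.exp(2j * np.pi * np.outer(np.arange(n1), rng.uniform(0, 1, r)))
Z = np.exp(2j * np.pi * np.outer(np.arange(n2), rng.uniform(0, 1, r)))
X = Y @ np.diag(rng.standard_normal(r)) @ Z.T

Xe = enhance(X, k1=n1 // 2, k2=n2 // 2)
print(Xe.shape)                    # (64, 81): much larger than X ...
print(np.linalg.matrix_rank(Xe))   # ... yet its rank is still r = 3
```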
Enhanced Matrix Completion (EMaC)
• Our recovery algorithm: Enhanced Matrix Completion (EMaC)
$$\text{(EMaC)}: \quad \underset{M \in \mathbb{C}^{n_1 \times n_2}}{\text{minimize}} \;\; \|M_e\|_* \quad \text{subject to} \;\; M_{i,j} = X_{i,j}, \;\; \forall (i,j) \in \Omega,$$
where $\Omega$ denotes the sampling set and $\|\cdot\|_*$ denotes the nuclear norm.
  ◦ nuclear norm minimization (convex)
[Figure: observation pattern $\Omega$ of the data matrix]
• Existing MC results won't apply: they require at least $O(nr)$ samples.
• Question: how many samples do we need?
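A minimal EMaC sketch, shown in the 1-D (scalar Hankel) case for brevity; it assumes the `cvxpy` package and is my illustration of the program above, not the authors' code:

```python
import numpy as np
import cvxpy as cp   # assumed available; canonicalizes the nuclear norm to an SDP

# Recover a spectrally sparse signal from partial samples by minimizing the
# nuclear norm of its Hankel lifting (1-D analogue of the EMaC program).
n, r, k = 16, 2, 8                                      # hypothetical sizes
rng = np.random.default_rng(3)
x = np.exp(2j * np.pi * np.outer(np.arange(n), rng.uniform(0, 1, r))).sum(axis=1)
omega = np.flatnonzero(rng.uniform(size=n) < 0.7)       # observed sample indices

m = cp.Variable(n, complex=True)
# Hankel lifting of the variable: entry (i, j) is m[i + j]
Me = cp.bmat([[m[i + j] for j in range(n - k + 1)] for i in range(k)])
prob = cp.Problem(cp.Minimize(cp.normNuc(Me)), [m[omega] == x[omega]])
prob.solve()
print(np.max(np.abs(m.value - x)))                      # small if recovery succeeds
```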
Coherence Measures
• Notation: $G_L$ is an $r \times r$ Gram matrix such that
$$(G_L)_{il} := \langle y^{(i)}, y^{(l)} \rangle \, \langle z^{(i)}, z^{(l)} \rangle,$$
where $y^{(i)} := (1, y_i, y_i^2, \cdots, y_i^{k_1 - 1})$ and $y_i := e^{j 2\pi f_{1i}}$; $z^{(i)}$ and $G_R$ are defined similarly with different dimensions. (The Gram entries are Dirichlet-kernel values.)
• Incoherence property arises w.r.t. $\mu_1$ if
$$\sigma_{\min}(G_L) \ge \frac{1}{\mu_1}, \qquad \sigma_{\min}(G_R) \ge \frac{1}{\mu_1}.$$
• Examples:
  ◦ randomly generated frequencies
  ◦ (mild) perturbation of grid points
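A sketch of how one might compute the 1-D ingredient of this coherence measure; the column normalization and all sizes are my assumptions for illustration:

```python
import numpy as np

# sigma_min of the r x r Gram matrix built from (normalized) vectors
# y^(i) = (1, y_i, ..., y_i^{k1-1}); its entries are Dirichlet-kernel values.
k1, r = 32, 4
rng = np.random.default_rng(4)
f = rng.uniform(0, 1, r)                               # randomly generated frequencies
Yk = np.exp(2j * np.pi * np.outer(np.arange(k1), f)) / np.sqrt(k1)

GL = Yk.conj().T @ Yk                                  # (G_L)_{il} = <y^(i), y^(l)>
mu1 = 1.0 / np.linalg.svd(GL, compute_uv=False).min()
print(mu1)   # incoherence holds w.r.t. this mu_1: sigma_min(G_L) >= 1/mu_1
```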
Theoretical Guarantees for the Noiseless Case
• Theorem 1 (Noiseless Samples). Let $n = n_1 n_2$. If all measurements are noiseless, then EMaC recovers $X$ with high probability if
$$m \sim \Theta(\mu_1 r \log^3 n).$$
• Implications:
  ◦ minimal sample complexity: $O(r \log^3 n)$
  ◦ general theoretical guarantees for Hankel (Toeplitz) matrix completion; see applications in control, MRI, natural language processing, etc.