Going off the grid
Benjamin Recht, University of California, Berkeley
Joint work with Badri Bhaskar, Parikshit Shah, and Gongguo Tang
Applications: imaging, astronomy, seismology, spectroscopy, DOA estimation, sampling, GPS, radar, ultrasound.

The same parsimonious model appears throughout, in several guises:

• Sinusoidal mixtures: x(t) = \sum_{j=1}^{k} c_j e^{i 2\pi f_j t}
• Pulse trains: x(t) = \sum_{j=1}^{k} c_j g(t - \tau_j)
• Modulated pulses: x(t) = \sum_{j=1}^{k} c_j g(t - \tau_j) e^{i 2\pi f_j t}
• Point sources blurred by a point-spread function: x(s) = \mathrm{PSF} * \left( \sum_{j=1}^{k} c_j \delta(s - s_j) \right)
• which in the Fourier domain reads: \hat{x}(\omega) = \sum_{j=1}^{k} c_j e^{-i 2\pi \omega s_j} \, \widehat{\mathrm{PSF}}(\omega)
Observe a sparse combination of sinusoids

x_m = \sum_{k=1}^{s} c_k e^{i 2\pi m u_k} \quad \text{for some } u_k \in [0, 1)

Spectrum estimation: find a combination of sinusoids agreeing with time-series data.

Classic (1795): Prony's method. Assume the coefficients are positive for simplicity. Then

\mathrm{toep}(x) :=
\begin{bmatrix} x_0 & \bar{x}_1 & \bar{x}_2 & \bar{x}_3 \\ x_1 & x_0 & \bar{x}_1 & \bar{x}_2 \\ x_2 & x_1 & x_0 & \bar{x}_1 \\ x_3 & x_2 & x_1 & x_0 \end{bmatrix}
= \sum_{k=1}^{s} c_k
\begin{bmatrix} 1 \\ e^{i 2\pi u_k} \\ e^{i 4\pi u_k} \\ e^{i 6\pi u_k} \end{bmatrix}
\begin{bmatrix} 1 \\ e^{i 2\pi u_k} \\ e^{i 4\pi u_k} \\ e^{i 6\pi u_k} \end{bmatrix}^{*}

• toep(x) is positive semidefinite, and any null vector corresponds to a polynomial that vanishes at the points e^{i 2\pi u_k}
• MUSIC, ESPRIT, Cadzow, etc. build on this observation
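A small numerical sketch of the observation above (frequencies, coefficients, and tolerances are illustrative, not from the talk): build the Toeplitz moment matrix of a two-sinusoid signal with positive coefficients, confirm it is PSD with rank equal to the number of sinusoids, and check that the steering vectors of the true frequencies are orthogonal to the null (noise) subspace, which is exactly what MUSIC-style methods exploit.

```python
import numpy as np

n = 8
u_true = np.array([0.10, 0.35])   # frequencies in [0, 1)
c_true = np.array([1.0, 0.7])     # positive coefficients

def steering(u, n):
    """Vector a(u) = (1, e^{i2pi u}, ..., e^{i2pi (n-1) u})."""
    return np.exp(2j * np.pi * u * np.arange(n))

# toep(x) = sum_k c_k a(u_k) a(u_k)^*  -- the PSD Toeplitz moment matrix
T = sum(c * np.outer(steering(u, n), steering(u, n).conj())
        for c, u in zip(c_true, u_true))

eigvals, eigvecs = np.linalg.eigh(T)          # ascending eigenvalues
assert eigvals.min() > -1e-10                 # positive semidefinite
assert int(np.sum(eigvals > 1e-8)) == len(u_true)  # rank = # of sinusoids

# Any null vector gives a polynomial vanishing at e^{i2pi u_k}:
# equivalently, each a(u_k) is orthogonal to the noise subspace.
noise_subspace = eigvecs[:, : n - len(u_true)]
for u in u_true:
    assert np.linalg.norm(noise_subspace.conj().T @ steering(u, n)) < 1e-8
```

The final assertion is the rank/null-space fact stated on the slide; root-finding on the null-space polynomial (Prony) or peak-finding on the noise-subspace pseudospectrum (MUSIC) then recovers the u_k.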
Observe a sparse combination of sinusoids

x_m = \sum_{k=1}^{s} c_k e^{i 2\pi m u_k} \quad \text{for some } u_k \in [0, 1)

Spectrum estimation: find a combination of sinusoids agreeing with time-series data.

Contemporary: x \approx F c with c sparse, where F is the n \times N matrix with entries F_{ab} = \exp(i 2\pi a b / N).

Solve with the LASSO: minimize \; \| x - F c \|_2^2 + \mu \| c \|_1
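A minimal sketch of this gridded-LASSO approach (sizes, frequencies, and the regularization weight are made up for illustration): form the n × N grid matrix F, place two on-grid frequencies, and solve the LASSO by proximal gradient (ISTA) with complex soft-thresholding.

```python
import numpy as np

n, N = 32, 64                       # samples, grid size
F = np.exp(2j * np.pi * np.outer(np.arange(n), np.arange(N)) / N)

# Signal with two on-grid frequencies (DFT-grid indices 10 and 24)
c_true = np.zeros(N, dtype=complex)
c_true[10], c_true[24] = 1.0, 0.8
x = F @ c_true

mu = 0.5
step = 1.0 / np.linalg.norm(F, 2) ** 2   # 1/L for the smooth term
c = np.zeros(N, dtype=complex)
for _ in range(500):
    g = c - step * (F.conj().T @ (F @ c - x))        # gradient step
    mag = np.maximum(np.abs(g), 1e-12)
    c = g * np.maximum(1 - step * mu / mag, 0)       # complex soft threshold

support = set(np.argsort(np.abs(c))[-2:])
assert support == {10, 24}           # the two largest coefficients sit on the true grid points
```

When the true frequencies fall between grid points, this is exactly where the discretization error and basis mismatch discussed on the next slide appear.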
Observe a sparse combination of sinusoids: x_m = \sum_{k=1}^{s} c_k e^{i 2\pi m u_k} for some u_k \in [0,1). Spectrum estimation: find a combination of sinusoids agreeing with time-series data.

Classic (SVD-based):
• grid free
• need to know model order
• lack of quantitative theory
• unstable in practice

Contemporary (gridding + ℓ1 minimization):
• robust
• model selection
• quantitative theory
• discretization error
• basis mismatch
• numerical instability

Can we bridge the gap?
Parsimonious Models

model = \sum (weights × atoms)

• Search for the best linear combination of the fewest atoms
• "rank" = fewest atoms needed to describe the model
Atomic Norms

• Given a basic set of atoms \mathcal{A}, define the function

\| x \|_{\mathcal{A}} = \inf \{ t > 0 : x \in t \, \mathrm{conv}(\mathcal{A}) \}

• When \mathcal{A} is centrosymmetric, we get a norm:

\| x \|_{\mathcal{A}} = \inf \left\{ \sum_{a \in \mathcal{A}} |c_a| : x = \sum_{a \in \mathcal{A}} c_a a \right\}

• IDEA: minimize \| z \|_{\mathcal{A}} subject to \Phi z = y
• When does this work?
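Two familiar special cases (from the Chandrasekaran–Recht–Parrilo–Willsky framework cited later in the talk) show how the definition recovers well-known norms:

```latex
% Sparse vectors: atoms are the signed standard basis vectors
\mathcal{A} = \{\pm e_1, \dots, \pm e_n\}
  \;\Rightarrow\; \mathrm{conv}(\mathcal{A}) \text{ is the cross-polytope, so }
  \|x\|_{\mathcal{A}} = \|x\|_1 .

% Low-rank matrices: atoms are the unit-norm rank-one matrices
\mathcal{A} = \{\, u v^{\ast} : \|u\|_2 = \|v\|_2 = 1 \,\}
  \;\Rightarrow\; \|X\|_{\mathcal{A}} = \|X\|_{*} \ \text{(nuclear norm)} .
```

In both cases the atomic-norm program above specializes to a familiar convex relaxation: basis pursuit for the first, nuclear-norm minimization for the second.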
Atomic Norm Minimization

IDEA: minimize \| z \|_{\mathcal{A}} subject to \Phi z = y

• Generalizes existing, powerful methods
• Rigorous recipe for developing and analyzing new algorithms
• Precise, tight bounds on the number of measurements needed for model recovery
• One algorithm prototype for a myriad of data-analysis applications

Chandrasekaran, Recht, Parrilo, and Willsky
Spectrum Estimation

Observe a sparse combination of sinusoids

x^{\star}_m = \sum_{k=1}^{s} c^{\star}_k e^{i 2\pi m u^{\star}_k} \quad \text{for some } u^{\star}_k \in [0, 1)

Observe: y = x^{\star} + \omega (signal plus noise)

Atomic set:

\mathcal{A} = \left\{ \left( e^{i\theta},\, e^{i 2\pi\phi + i\theta},\, e^{i 4\pi\phi + i\theta},\, \dots,\, e^{i 2\pi n \phi + i\theta} \right) : \theta \in [0, 2\pi),\ \phi \in [0, 1) \right\}

Classical techniques (Prony, Matrix Pencil, MUSIC, ESPRIT, Cadzow) use the fact that noiseless moment matrices are low-rank:

\begin{bmatrix} \mu_0 & \bar\mu_1 & \bar\mu_2 & \bar\mu_3 \\ \mu_1 & \mu_0 & \bar\mu_1 & \bar\mu_2 \\ \mu_2 & \mu_1 & \mu_0 & \bar\mu_1 \\ \mu_3 & \mu_2 & \mu_1 & \mu_0 \end{bmatrix}
= \sum_{k=1}^{r} \alpha_k
\begin{bmatrix} 1 \\ e^{i\phi_k} \\ e^{2i\phi_k} \\ e^{3i\phi_k} \end{bmatrix}
\begin{bmatrix} 1 \\ e^{i\phi_k} \\ e^{2i\phi_k} \\ e^{3i\phi_k} \end{bmatrix}^{*}
\succeq 0
Atomic Norm for Spectrum Estimation

IDEA: minimize \| z \|_{\mathcal{A}} subject to \| \Phi z - y \| \le \delta

Atomic set:

\mathcal{A} = \left\{ \left( e^{i\theta},\, e^{i 2\pi\phi + i\theta},\, e^{i 4\pi\phi + i\theta},\, \dots,\, e^{i 2\pi n \phi + i\theta} \right) : \theta \in [0, 2\pi),\ \phi \in [0, 1) \right\}

• How do we solve the optimization problem?
• Can we approximate the true signal from partial and noisy measurements?
• Can we estimate the frequencies from partial and noisy measurements?
Which atomic norm for sinusoids?

x_m = \sum_{k=1}^{s} c_k e^{i 2\pi m u_k} \quad \text{for some } u_k \in [0, 1)

When the c_k are positive:

\begin{bmatrix} x_0 & \bar{x}_1 & \bar{x}_2 & \bar{x}_3 \\ x_1 & x_0 & \bar{x}_1 & \bar{x}_2 \\ x_2 & x_1 & x_0 & \bar{x}_1 \\ x_3 & x_2 & x_1 & x_0 \end{bmatrix}
= \sum_{k=1}^{r} c_k
\begin{bmatrix} 1 \\ e^{i 2\pi u_k} \\ e^{i 4\pi u_k} \\ e^{i 6\pi u_k} \end{bmatrix}
\begin{bmatrix} 1 \\ e^{i 2\pi u_k} \\ e^{i 4\pi u_k} \\ e^{i 6\pi u_k} \end{bmatrix}^{*}
\succeq 0

• For general coefficients, the convex hull is characterized by linear matrix inequalities (Toeplitz positive semidefinite):

\| x \|_{\mathcal{A}} = \inf \left\{ \tfrac{1}{2} t + \tfrac{1}{2} w_0 : \begin{bmatrix} t & x^{*} \\ x & \mathrm{toep}(w) \end{bmatrix} \succeq 0 \right\}

• Moment curve of [t, t^2, t^3, t^4, \dots], t \in S^1
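A numpy-only sanity check of the semidefinite characterization (a sketch, not part of the talk): for a single atom x = a(u), the choice t = 1 and toep(w) = a(u) a(u)^*, i.e. w_m = e^{i 2π m u}, makes the block matrix rank-one PSD and gives objective (t + w_0)/2 = 1, certifying ||a(u)||_A ≤ 1, which matches the fact that a single atom has atomic norm 1.

```python
import numpy as np

n, u = 6, 0.3
a = np.exp(2j * np.pi * u * np.arange(n))   # a single atom x = a(u)

# Candidate certificate: t = 1 and toep(w) = a a^*, so w_m = e^{i 2 pi m u}
t = 1.0
T = np.outer(a, a.conj())   # Toeplitz: entry (j, k) is e^{i 2 pi (j - k) u}
M = np.block([[np.array([[t]]), a.conj()[None, :]],
              [a[:, None],      T]])

# Feasibility: M = [1; a][1; a]^* is Hermitian and positive semidefinite
assert np.allclose(M, M.conj().T)
assert np.linalg.eigvalsh(M).min() > -1e-9

objective = 0.5 * (t + T[0, 0].real)        # (t + w_0)/2
assert np.isclose(objective, 1.0)
```

For general x, computing ||x||_A this way requires solving the SDP over (t, w), e.g. with an off-the-shelf semidefinite solver; this snippet only verifies one feasible point.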
[Figure: the atomic sets \mathcal{A} and \mathcal{A} \cup \{-\mathcal{A}\}, with their convex hulls \mathrm{conv}(\mathcal{A}) and \mathrm{conv}(\mathcal{A} \cup \{-\mathcal{A}\}).]
[Figure: the signals \cos(2\pi u_1 k)/2 + \cos(2\pi u_2 k)/2 and \cos(2\pi u_1 k)/2 - \cos(2\pi u_2 k)/2 for the nearby frequencies u_1 = 0.1413 and u_2 = 0.1411.]
Nearly optimal rates

Observe a sparse combination of sinusoids

x^{\star}_m = \sum_{k=1}^{s} c^{\star}_k e^{i 2\pi m u^{\star}_k} \quad \text{for some } u^{\star}_k \in [0, 1)

Observe: y = x^{\star} + \omega (signal plus noise)

Assume the frequencies are far apart: \min_{p \neq q} d(u_p, u_q) \ge \frac{4}{n}

Solve: \hat{x} = \arg\min_{x \in \mathbb{C}^n} \; \tfrac{1}{2} \| x - y \|_2^2 + \mu \| x \|_{\mathcal{A}}

Error rate: \frac{1}{n} \| \hat{x} - x^{\star} \|_2^2 \lesssim \frac{\sigma^2 s \log(n)}{n}

• No algorithm can do better than \frac{1}{n} \| \hat{x} - x^{\star} \|_2^2 \ge C' \frac{\sigma^2 s \log(n/s)}{n}, even if the frequencies are well-separated
• No algorithm can do better than \mathbb{E}\, \frac{1}{n} \| \hat{x} - x^{\star} \|_2^2 \ge C' \frac{\sigma^2 s}{n}, even if we knew all of the frequencies u^{\star}_k
Mean Square Error Performance Profile

• Frequencies generated at random with 1/n separation; random phases, fading amplitudes.
• Cadzow and MUSIC are provided the model order; AST (Atomic norm Soft Thresholding) and LASSO estimate the noise power.
• Performance profile over all parameter values and settings. For algorithm s:

P_s(\beta) = \frac{\#\{ p \in \mathcal{P} : \mathrm{MSE}_s(p) \le \beta \min_s \mathrm{MSE}_s(p) \}}{\#(\mathcal{P})}

• Lower MSE is better; a higher profile curve is better.
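The profile formula above can be computed directly; here is a toy version with a made-up MSE table (the numbers are purely illustrative, not the talk's experimental data):

```python
import numpy as np

# Rows: algorithms; columns: problem instances p in P (MSE values are invented)
mse = {
    "AST":    np.array([1.0, 2.0, 1.5, 4.0]),
    "MUSIC":  np.array([1.2, 1.9, 3.0, 8.0]),
    "Cadzow": np.array([3.0, 2.5, 1.4, 4.5]),
}
best = np.min(np.vstack(list(mse.values())), axis=0)   # min_s MSE_s(p) per instance

def profile(scores, beta):
    """P_s(beta): fraction of instances where `scores` is within beta of the best."""
    return float(np.mean(scores <= beta * best))

assert profile(mse["AST"], 1.0) == 0.5    # best on 2 of the 4 toy instances
assert profile(mse["AST"], 2.0) == 1.0    # within 2x of the best everywhere
```

Sweeping beta and plotting each algorithm's profile curve reproduces the kind of figure shown on this slide; the algorithm whose curve is highest dominates.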
Extracting the decomposition

• How do you extract the frequencies? Look at the dual norm:

\| v \|^{*}_{\mathcal{A}} = \max_{a \in \mathcal{A}} \langle a, v \rangle = \max_{u \in [0, 1)} \left| \sum_{k=1}^{n} v_k e^{2\pi i k u} \right|

• The dual norm is the maximum modulus of a trigonometric polynomial
• At optimality, the maximum modulus is attained at the support of the signal; this works much better than Prony interpolation in practice
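A sketch of evaluating that dual polynomial on a grid (the dual vector here is chosen by hand as a normalized conjugated steering vector, a hypothetical stand-in for the dual optimal solution, so that the maximizer is known to sit at u0):

```python
import numpy as np

n, u0 = 16, 0.3
k = np.arange(1, n + 1)
v = np.exp(-2j * np.pi * k * u0) / n        # hypothetical dual vector

# |q(u)| = |sum_k v_k e^{2 pi i k u}| evaluated on a fine grid of [0, 1)
grid = np.linspace(0, 1, 4096, endpoint=False)
q = np.abs(np.exp(2j * np.pi * np.outer(grid, k)) @ v)

u_hat = grid[np.argmax(q)]                  # frequency estimate = maximizer
dual_norm = q.max()                         # max modulus = dual atomic norm of v
assert abs(u_hat - u0) < 1e-3
assert abs(dual_norm - 1.0) < 1e-3
```

With the actual dual optimal solution from the atomic-norm program, the same grid search (refined by local maximization) returns all of the estimated frequencies at once.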
Aside: "the proof"

• How do you prove any compressed-sensing-esque result? Look at the dual norm:

\| v \|^{*}_{\mathcal{A}} = \max_{a \in \mathcal{A}} \langle a, v \rangle = \max_{u \in [0, 1)} \left| \sum_{k=1}^{n} v_k e^{2\pi i k u} \right|

• Generic proof:
  – Construct a polynomial such that the atoms you want maximize the polynomial.
  – Then prove that the polynomial is strictly bounded everywhere else.
Localization Guarantees

[Figure: dual polynomial magnitude over frequency in [0, 1], marking estimated, true, and spurious frequencies; the near region around each true frequency has width 0.16/n.]

With F the far region, T the true support, and N_j the near region of true frequency f_j:

• Spurious amplitudes: \sum_{l : \hat{f}_l \in F} | \hat{c}_l | \le C_1 \sigma \sqrt{\frac{k^2 \log(n)}{n}}

• Weighted frequency deviation: \sum_{l : \hat{f}_l \in N_j} | \hat{c}_l | \, d(f_j, \hat{f}_l)^2 \le C_2 \sigma \sqrt{\frac{k^2 \log(n)}{n}} for each f_j \in T

• Near region approximation: \left| c_j - \sum_{l : \hat{f}_l \in N_j} \hat{c}_l \right| \le C_3 \sigma \sqrt{\frac{k^2 \log(n)}{n}}
Frequency Localization

[Figures: performance profiles P_s(\beta) for the three localization metrics (m_1 spurious amplitudes, m_2 weighted frequency deviation, m_3 near-region approximation), and the metrics versus SNR from -10 to 20 dB at k = 32, comparing AST, MUSIC, and Cadzow.]
Incomplete Data / Random Sampling

• Observe a random subset of samples T \subset \{0, 1, \dots, n-1\}
• On the grid: Candes, Romberg & Tao (2004). Off the grid: new; compressed sensing extended to the continuous domain
• Recover the missing samples by solving

minimize_z \| z \|_{\mathcal{A}} subject to z_T = x_T

• Extract the frequencies from the dual optimal solution