Fast L0-based Sparse Signal Recovery – Nick Kingsbury and Yingsong Zhang – PowerPoint PPT Presentation


  1. Fast L0-based Sparse Signal Recovery. Nick Kingsbury and Yingsong Zhang, Signal Processing Group, Department of Engineering, University of Cambridge. ACVT seminar, Adelaide, May 2011.

  2. Outline:
     1. Introduction
     2. L0 reweighted-L2 minimization: IRLS, basic model, basic algorithm, fast algorithm L0RL2
     3. Continuation strategy: geometry of the penalty function, L0RL2 with continuation
     4. Numerical results
     5. Super-resolution
     6. Experimental results
     7. Conclusion

  3. Introduction. The L0 minimisation problem: find x which gives

        \min_x \|x\|_0 \quad \text{subject to} \quad \|y - \Phi x\|_2 \le \xi   (1)

     where Φ is known and of size M × N with M < N.
     - L0: NP-hard combinatorial search for the exact solution.
     - L1: convex problem, but poor computation speed for large problems; greedy algorithms can be used, e.g. CoSaMP, NESTA.
     - Lp: iterative reweighted techniques, e.g. IRL1, IRLS.
     - L0: greedy algorithms usually used, e.g. IHT, ℓ0-AP, SL0.
     We propose a highly efficient form of IRLS which approximates L0 minimisation, allows N ~ 10^7 or 10^8, and has good reconstruction performance.
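A minimal Python/NumPy sketch (not from the slides) of the kind of underdetermined system in (1): a known M × N matrix Phi with M < N, a sparse signal, and noisy measurements y. The sizes, sparsity level and noise level below are arbitrary illustrative choices; the later sketches reuse these names.

    import numpy as np

    rng = np.random.default_rng(0)
    M, N, K = 128, 512, 10                           # measurements, signal length, sparsity (illustrative)
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # known M x N sensing matrix, M < N
    x_true = np.zeros(N)
    support = rng.choice(N, size=K, replace=False)
    x_true[support] = rng.standard_normal(K)         # K-sparse ground-truth signal
    nu = 0.01                                        # measurement noise standard deviation
    y = Phi @ x_true + nu * rng.standard_normal(M)   # noisy observations y = Phi x + n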

  4. Iterative reweighted least-squares minimization (IRLS):

        \min_x \; x^T S x \quad \text{subject to} \quad \|y - \Phi x\|_2 = 0

     where S is a diagonal weight matrix. The solution of each step is the solution of

        \min \|v\|_2 \quad \text{subject to} \quad \|y - \Phi x\|_2 = 0 \;\text{and}\; x = S^{-1/2} v,

     which is unique, and the reweighted iterative rule is

        x = S^{-1/2} (\Phi S^{-1/2})^{\dagger} y,   (2)

     where † denotes the pseudo-inverse (i.e. H^{\dagger} = (H^T H)^{-1} H^T). The inverse of (H^T H) is difficult for large problems.
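A sketch of one classical IRLS step as in (2), assuming the toy Phi and y above and a current diagonal weight matrix S stored as a vector s. The np.linalg.pinv call hides exactly the dense factorisation whose cost the slide points to for large problems.

    import numpy as np

    def irls_step(Phi, y, s):
        # One reweighted step x = S^(-1/2) (Phi S^(-1/2))^+ y, with S = diag(s).
        s_inv_half = 1.0 / np.sqrt(s)      # diagonal of S^(-1/2)
        H = Phi * s_inv_half               # Phi S^(-1/2): scales column j by s_inv_half[j]
        v = np.linalg.pinv(H) @ y          # minimum-norm v that fits the data exactly
        return s_inv_half * v              # x = S^(-1/2) v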

  5. Iterative reweighted least-squares minimization (IRLS): choices of weights.

        Gorodnitsky & Rao (1997):    S_j = 1 / (|x_j|^2)^{1 - p/2},               for 0 ≤ p ≤ 1
        Chartrand & Yin (2008):      S_j = 1 / (|x_j|^2 + \epsilon^2)^{1 - p/2},  for 0 ≤ p ≤ 1

     Daubechies et al. (2009):
     - Theoretical analysis shows that the local convergence rate of the algorithm is superlinear; smaller p values give a faster convergence rate.
     - Experiments show that the Lp norm with p → 0 can help to achieve a higher success rate in exact signal recovery.
     However, IRLS in this form strictly only models noise-free observations.
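The reweighting rules themselves are cheap elementwise updates. A sketch of the Chartrand & Yin form, with eps = 0 recovering the Gorodnitsky & Rao rule (a zero x_j then gives an infinite weight, which is one motivation for the ε-regularised version):

    import numpy as np

    def irls_weights(x, p, eps=0.0):
        # S_j = 1 / (|x_j|^2 + eps^2)^(1 - p/2), for 0 <= p <= 1
        return 1.0 / (np.abs(x) ** 2 + eps ** 2) ** (1.0 - p / 2.0)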

  6. Basic model – Gaussian measurement noise. Consider the noisy system

        y = \Phi x + n   (3)

     where we assume:
     - n ~ N(0, \nu^2) is noise;
     - Φ is a known M × N matrix;
     - the prior of x is a zero-mean scaled adaptive Gaussian model such that

        p(x) \propto \sqrt{|S|} \, \exp\big(-\tfrac{1}{2} x^T S x\big),

     where S is a diagonal matrix whose j-th diagonal entry is S_j = 1/\sigma_j^2. This is well suited for modelling wavelet coefficients.

  7. Basic model – prior for σ_j. We further assume an independent prior ∝ exp(−(1/2) ε^2/σ_j^2) for each σ_j. This prior can be regarded as an approximation to the lower-bounded uniform prior U(ε, +∞): it is approximately uniformly distributed in the region (3ε, +∞), while it tends to 0 as σ_j approaches 0, which prevents σ_j^2 getting too small and hence avoids numerical instability in the model.
     [Figure: illustration of the prior for σ_j, exp(−ε^2/x^2), compared with U(ε, +∞).]

  8. Log MAP function and basic algorithm. The prior on σ_j gives the following negative log MAP function:

        J(x, S) = \nu^2 \Big( x^T S x - \ln|S| + \epsilon^2 \sum_j S_j \Big) + \|y - \Phi x\|^2   (4)

     Minimizing J(x, S) results in the following iteration rules (basic algorithm):

        x = \arg\min_x J(x, S) = (\Phi^T \Phi + \nu^2 S)^{-1} \Phi^T y
        \sigma_j^2 = \arg\min_{\sigma_j^2} J(x, S) = |x_j|^2 + \epsilon^2
        S_j = 1/\sigma_j^2 = 1/(|x_j|^2 + \epsilon^2) \quad \forall j   (5)
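A direct (slow) sketch of the basic iteration (5), alternating the x-update, which needs an N × N solve, with the closed-form S-update; this is the step the fast algorithm on the following slides avoids. ν and ε are assumed given, and the toy Phi and y from the sketch above are reused.

    import numpy as np

    def basic_l0rl2(Phi, y, nu, eps, n_iter=50):
        x = Phi.T @ y                                     # simple initialisation
        for _ in range(n_iter):
            s = 1.0 / (np.abs(x) ** 2 + eps ** 2)         # S_j = 1 / (|x_j|^2 + eps^2)
            A = Phi.T @ Phi + nu ** 2 * np.diag(s)        # Phi^T Phi + nu^2 S: N x N, costly for large N
            x = np.linalg.solve(A, Phi.T @ y)             # x = (Phi^T Phi + nu^2 S)^(-1) Phi^T y
        return x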

  9. Fast algorithm L0RL2 – to avoid (\Phi^T \Phi + \nu^2 S)^{-1}. For wavelet-like signal spaces, we introduce the vector α = [α_1 ... α_K] and the diagonal operator Λ_α that multiplies the k-th subspace/subband by α_k:

        (\Lambda_\alpha x)_k = \alpha_k x_k \quad \text{for } k = 1, \dots, K

     where x_k is a masked version of x with non-zeros only in subband k, and where α_k may be optimized independently for each subband to be the minimum α_k such that \alpha_k x_k^T x_k \ge \|\Phi x_k\|^2 for any k and x. Then, using majorization-minimization (MM) [3], we have the following new auxiliary function:

        J_\alpha(x, S, z) = J(x, S) + (x - z)^T \Lambda_\alpha (x - z) - \|\Phi (x - z)\|^2   (6)
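In the simplest case of a single subband (K = 1), the minimum admissible α is the squared spectral norm of Φ, i.e. the smallest α with α xᵀx ≥ ||Φx||² for all x. A sketch (an illustrative construction, not code from the talk) of estimating it by power iteration; per-subband values would repeat this with Φ restricted to each subband's support.

    import numpy as np

    def majorizer_alpha(Phi, n_power_iter=100):
        # Smallest alpha with alpha * x^T x >= ||Phi x||^2 for all x:
        # the largest eigenvalue of Phi^T Phi, estimated by power iteration.
        v = np.random.default_rng(1).standard_normal(Phi.shape[1])
        v /= np.linalg.norm(v)
        for _ in range(n_power_iter):
            v = Phi.T @ (Phi @ v)
            v /= np.linalg.norm(v)
        return float(v @ (Phi.T @ (Phi @ v)))   # Rayleigh quotient ~ sigma_max(Phi)^2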

  10. Fast algorithm L0RL2. The new auxiliary function eliminates the difficult x^T \Phi^T \Phi x term from J(x, S) and results in the following iteration rules:

        x_{n+1} = (\Lambda_\alpha + \nu^2 S_n)^{-1} \big[ (\Lambda_\alpha - \Phi^T \Phi) z_n + \Phi^T y \big]   (7a)
        z_{n+1} = \arg\min_z J_\alpha(x_{n+1}, S_n, z) = x_{n+1}
        S_{n+1} = \mathrm{diag}\big( (|x_{j,n+1}|^2 + \epsilon^2)^{-1} \big)_{j=1,\dots,N}   (7b)

     Note that (\Lambda_\alpha + \nu^2 S_n) is now a diagonal matrix and hence is easy to invert!
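A sketch of the fast L0RL2 iteration (7a)-(7b) under the single-subband assumption above, so that Λ_α reduces to a scalar alpha. Each update then needs only matrix-vector products with Φ and Φᵀ plus elementwise operations; the optional warm start x0 is used by the continuation sketch further below.

    import numpy as np

    def fast_l0rl2(Phi, y, nu, eps, alpha, n_iter=200, x0=None):
        x = Phi.T @ y if x0 is None else x0.copy()
        z = x.copy()
        Pty = Phi.T @ y
        for _ in range(n_iter):
            s = 1.0 / (np.abs(x) ** 2 + eps ** 2)        # (7b): diagonal of S_n, from x_n
            rhs = alpha * z - Phi.T @ (Phi @ z) + Pty    # (Lambda_alpha - Phi^T Phi) z_n + Phi^T y
            x = rhs / (alpha + nu ** 2 * s)              # (7a): diagonal system, elementwise division
            z = x                                        # z_{n+1} = x_{n+1}
        return x

    # e.g. x_hat = fast_l0rl2(Phi, y, nu=0.01, eps=1e-3, alpha=majorizer_alpha(Phi))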

  11. Fast algorithm L0RL2: cost function. Substituting the weight update (7b) into the negative log MAP function (4) gives the cost function

        J(x, S) = \nu^2 \Big( \text{const} + \sum_j \ln\big((x_j^2 + \epsilon^2)/\epsilon\big) \Big) + \|y - \Phi x\|^2   (9)
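The substituted cost (9) is cheap to evaluate and makes a convenient convergence monitor. A sketch, dropping the additive constant and keeping the ε-scaled logarithm as written on the slide:

    import numpy as np

    def l0rl2_cost(Phi, y, x, nu, eps):
        penalty = nu ** 2 * np.sum(np.log((x ** 2 + eps ** 2) / eps))   # sum_j ln((x_j^2 + eps^2)/eps)
        fidelity = np.sum((y - Phi @ x) ** 2)                           # ||y - Phi x||^2
        return penalty + fidelity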


  12. Geometry of the log-sum penalty function f_{ln,ε}:

        f_{\ln,\epsilon}(x) = C \ln\big((x^2 + \epsilon^2)/\epsilon^2\big),

     shown with ε = 0.1 and compared to \|x\|_1 and a thresholded \|x\|_0.
     [Figure: compared to the L1 norm, the geometry of the log-sum penalty function lends itself well to detecting sparsity as ε → 0.]
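For a quick numerical feel of this geometry, a scalar comparison of the three penalties (C = 1 and the L0 threshold value are illustrative choices, not values fixed by the slides):

    import numpy as np

    t = np.linspace(-2.0, 2.0, 401)
    eps = 0.1
    f_ln = np.log((t ** 2 + eps ** 2) / eps ** 2)   # log-sum penalty with C = 1
    f_l1 = np.abs(t)                                # L1 penalty
    f_l0 = (np.abs(t) > eps).astype(float)          # thresholded L0 penalty
    # Near t = 0 the log-sum curve stays almost flat, like L0, whereas L1 grows linearly.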

  13. Geometry: unit ball of f_{ln,ε}(x).
     [Figure: the effect of ε on the geometry of the unit ball of f_{ln,ε}(x), for ε^2 from 10 down to 0.0001. When ε is large, the geometry of f_{ln,ε} approximates the L2 ball; when ε becomes small, the geometry of f_{ln,ε} approaches that of the L0 norm.]

  14. L0RL2 with continuation: acceleration and parameter selection. Now we consider the minimization of our problem

        F_{\epsilon,\nu}(x) = \nu^2 \sum_j \ln\big((|x_j|^2 + \epsilon^2)/\epsilon\big) + \|y - \Phi x\|^2,   (10)

     where two parameters decide the solution path: ν balances fidelity against sparsity, and ε decides the geometry of the penalty function. We formulate a sequence of such problems F_{\epsilon_n,\nu_n}(x):

        F_{\epsilon_n,\nu_n}(x) = \nu_n^2 \sum_j \ln\big((|x_j|^2 + \epsilon_n^2)/\epsilon_n\big) + \|y - \Phi x\|^2   (11)

     starting from large ν_0 and ε_0, then simultaneously reducing ν_n and ε_n until ν_n = ν and ε_n = ε. It is easy to see that F_{\epsilon_n,\nu_n}(x) continuously deforms to F_{\epsilon,\nu}(x). We try to ensure that the path of the global minima of F_{\epsilon_n,\nu_n}(x) leads to the global minimum of F_{\epsilon,\nu}(x).
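A sketch of the continuation outer loop: start from large ν_0 and ε_0, shrink both geometrically towards the target ν and ε, and warm-start each inner solve (the fast_l0rl2 sketch above) from the previous solution. The number of stages, the decay factor and the inner iteration count are assumptions; the slides do not fix a schedule here.

    def l0rl2_continuation(Phi, y, nu, eps, alpha, n_stages=10, inner_iter=30, decay=0.6):
        nu_n, eps_n = 100.0 * nu, 100.0 * eps        # large starting values (illustrative factor)
        x = None
        for _ in range(n_stages):
            x = fast_l0rl2(Phi, y, nu_n, eps_n, alpha, n_iter=inner_iter, x0=x)
            nu_n = max(nu, decay * nu_n)             # reduce nu_n ...
            eps_n = max(eps, decay * eps_n)          # ... and eps_n simultaneously
        return fast_l0rl2(Phi, y, nu, eps, alpha, n_iter=inner_iter, x0=x)   # final target problem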
