Iteratively Reweighted ℓ1 Approaches to Sparse Composite Regularization

Phil Schniter
Joint work with Prof. Rizwan Ahmad (OSU)
Supported in part by NSF grant CCF-1018368.

MATHEON Conf. on Compressed Sensing and its Applications, TU-Berlin, Dec 11, 2015
Outline
1. Introduction and Motivation for Composite Penalties
2. Co-L1 and its Interpretations
3. Co-IRW-L1 and its Interpretations
4. Numerical Experiments
Introduction

Goal: Recover a signal x ∈ C^N from noisy linear measurements y = Φx + w ∈ C^M, where usually M ≪ N.

Approach: Solve the optimization problem

  x̂ = arg min_x  γ ‖y − Φx‖_2² + R(x),

with γ > 0 controlling the measurement fidelity.

Question: How should we choose the penalty/regularization R(x)?
Typical Choices of Penalty

Say Ψx is (approximately) sparse for an "analysis operator" Ψ ∈ C^{L×N}.

ℓ0 penalty: R(x) = ‖Ψx‖_0
- Impractical: the optimization problem is NP-hard.

ℓ1 penalty (generalized LASSO): R(x) = ‖Ψx‖_1
- Tightest convex relaxation of the ℓ0 penalty.
- Fast algorithms: ADMM, MFISTA, NESTA-UP, grAMPa, ...

Non-convex penalties:
- R(x) = ‖Ψx‖_p for p ∈ (0, 1) (via IRW-L2)
- R(x) = Σ_{l=1}^L log(ǫ + |ψ_l^T x|) with ǫ ≥ 0 (via IRW-L1)
- many others...
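For concreteness, the sketch below shows one standard way to solve the generalized-LASSO problem arg min_x γ‖y − Φx‖_2² + ‖Ψx‖_1 via ADMM with the split z = Ψx. This is an illustrative implementation with dense real matrices; the penalty parameter rho and the iteration count are arbitrary choices, not values from the slides, and it is not any of the specific solvers cited above.

```python
import numpy as np

def soft(v, tau):
    """Elementwise soft-thresholding, the prox of tau*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def genlasso_admm(y, Phi, Psi, gamma, rho=1.0, iters=200):
    """ADMM sketch for  min_x  gamma*||y - Phi x||_2^2 + ||Psi x||_1
    using the split z = Psi x (dense real matrices, for illustration)."""
    N = Phi.shape[1]
    L = Psi.shape[0]
    x, z, u = np.zeros(N), np.zeros(L), np.zeros(L)   # u is the scaled dual
    A = 2 * gamma * Phi.T @ Phi + rho * Psi.T @ Psi   # fixed x-update matrix
    b0 = 2 * gamma * Phi.T @ y
    for _ in range(iters):
        x = np.linalg.solve(A, b0 + rho * Psi.T @ (z - u))   # quadratic x-step
        z = soft(Psi @ x + u, 1.0 / rho)                      # l1 prox step
        u = u + Psi @ x - z                                   # dual update
    return x
```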
Choice of Analysis Operator

How to choose Ψ in practice?

Maybe a wavelet transform? Which one?

Maybe a concatenation of several transforms, Ψ = [Ψ_1; ...; Ψ_D] (e.g., SARA¹)?

What if the signal is more sparse in one dictionary than another? Can we compensate for this? Can we exploit this?

¹ Carrillo, McEwen, Van De Ville, Thiran, Wiaux, "Sparsity averaged reweighted analysis," IEEE SPL, 2013.
Example: Undecimated Wavelet Transform of MRI Cine

Note the different sparsity rate in each subband of a 1-level UWT:

[Figure: subbands of a 1-level undecimated wavelet transform of an MRI cine.]
Composite ℓ1 Penalties

We propose to use composite ℓ1 (Co-L1) penalties of the form

  R(x; λ) ≜ Σ_{d=1}^D λ_d ‖Ψ_d x‖_1,   λ_d ≥ 0,

where the Ψ_d ∈ C^{L_d × N} have unit-norm rows.

The Ψ_d could be chosen, for example, as
- different DWTs (e.g., db1, db2, db3, ..., db10),
- different subbands of a given DWT,
- row-subsets of I (i.e., group/hierarchical sparsity),
- or all of the above.

We then aim to simultaneously tune the weights {λ_d} and recover the signal x.
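To pin down the notation, here is a minimal sketch (not from the slides) that evaluates R(x; λ) for a list of analysis operators stored as dense matrices; the toy operators and weights below are arbitrary placeholders.

```python
import numpy as np

def composite_l1(x, Psis, lams):
    """Evaluate R(x; lambda) = sum_d lam_d * ||Psi_d x||_1."""
    return sum(lam * np.sum(np.abs(Psi @ x)) for Psi, lam in zip(Psis, lams))

# Toy example with two "dictionaries" acting on x in R^4:
x = np.array([1.0, 0.0, -2.0, 0.0])
Psi1 = np.eye(4)                                   # identity (unit-norm rows)
Psi2 = np.diff(np.eye(4), axis=0) / np.sqrt(2.0)   # finite differences, rows normalized
print(composite_l1(x, [Psi1, Psi2], lams=[1.0, 0.5]))
```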
The Co-L1 Algorithm

1: input: {Ψ_d}_{d=1}^D, Φ, y, γ > 0, ǫ ≥ 0
2: if Ψ_d x ∈ R^{L_d} then C_d = 1, elseif Ψ_d x ∈ C^{L_d} then C_d = 2
3: initialization: λ_d^{(1)} = 1 ∀d
4: for t = 1, 2, 3, ...
5:   x^{(t)} ← arg min_x  γ ‖y − Φx‖_2² + Σ_{d=1}^D λ_d^{(t)} ‖Ψ_d x‖_1
6:   λ_d^{(t+1)} ← C_d L_d / (ǫ + ‖Ψ_d x^{(t)}‖_1),   d = 1, ..., D
7: end
8: output: x^{(t)}

The algorithm
- leverages existing ℓ1 solvers (e.g., ADMM, MFISTA, NESTA-UP, grAMPa),
- reduces to the IRW-L1 algorithm [Figueiredo, Nowak '07] when L_d = 1 ∀d (single-atom dictionaries),
- applies to both real- and complex-valued cases.
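A minimal sketch of this outer loop for the real-valued case (C_d = 1), reusing the genlasso_admm routine from the earlier ADMM sketch as the inner weighted-ℓ1 solver; any of the solvers listed above could be substituted, and the iteration count and default ǫ are illustrative choices, not values from the slides.

```python
import numpy as np

def co_l1(y, Phi, Psis, gamma, eps=1e-8, outer_iters=10):
    """Co-L1 sketch (real-valued case, C_d = 1): alternate a weighted-l1 solve
    with the weight update  lam_d = L_d / (eps + ||Psi_d x||_1)."""
    lams = np.ones(len(Psis))
    x = np.zeros(Phi.shape[1])
    for _ in range(outer_iters):
        # Inner problem: min_x gamma*||y - Phi x||^2 + sum_d lam_d*||Psi_d x||_1,
        # handled by absorbing the weights into one stacked operator.
        Psi_w = np.vstack([lam * Psi for lam, Psi in zip(lams, Psis)])
        x = genlasso_admm(y, Phi, Psi_w, gamma)
        # Line 6 of the algorithm (with C_d = 1).
        lams = np.array([Psi.shape[0] / (eps + np.sum(np.abs(Psi @ x)))
                         for Psi in Psis])
    return x, lams
```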
The Co-IRW-L1 Algorithm

1: input: {Ψ_d}_{d=1}^D, Φ, y, γ > 0
2: initialization: λ_d^{(1)} = 1 ∀d, W_d^{(1)} = I ∀d
3: for t = 1, 2, 3, ...
4:   x^{(t)} ← arg min_x  γ ‖y − Φx‖_2² + Σ_{d=1}^D λ_d^{(t)} ‖W_d^{(t)} Ψ_d x‖_1
5:   (λ_d^{(t+1)}, ǫ_d^{(t+1)}) ← arg max_{λ_d ∈ Λ, ǫ_d > 0}  log p(x^{(t)}; λ, ǫ),   d = 1, ..., D
6:   W_d^{(t+1)} ← diag( 1/(ǫ_d^{(t+1)} + |ψ_{d,1}^T x^{(t)}|), ..., 1/(ǫ_d^{(t+1)} + |ψ_{d,L_d}^T x^{(t)}|) ),   d = 1, ..., D
7: end
8: output: x^{(t)}

The algorithm
- tunes both λ_d and the diagonal W_d for all d: hierarchical weighting,
- also tunes the regularization parameters ǫ_d for all d.
Understanding Co-L1 and Co-IRW-L1

In the sequel, we provide four interpretations of each algorithm:
1. majorization-minimization (MM) for a particular non-convex penalty,
2. a particular approximation of ℓ0 minimization,
3. Bayesian estimation according to a particular hierarchical prior,
4. a variational EM algorithm under a particular prior.
Co-L1 and its Interpretations
Optimization Interpretations of Co-L1

Co-L1 is an MM approach to the weighted log-sum optimization problem

  arg min_x  γ ‖y − Φx‖_2² + Σ_{d=1}^D L_d log(ǫ + ‖Ψ_d x‖_1),

and, as ǫ → 0, Co-L1 aims to solve the weighted ℓ_{1,0} problem

  arg min_x  γ ‖y − Φx‖_2² + Σ_{d=1}^D L_d 1{‖Ψ_d x‖_1 > 0}.

Note: L_d is the number of atoms in dictionary Ψ_d, and 1{·} is the indicator function.
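As a brief sketch of the MM step behind the first interpretation (a standard reweighting argument, not spelled out on the slide): each log term is concave in ‖Ψ_d x‖_1, so its tangent at the current iterate upper-bounds it, and minimizing the resulting majorizer is exactly line 5 of Co-L1 with the weights of line 6 (real case, C_d = 1).

```latex
% Tangent (first-order) majorization of the concave log term at x^{(t)}:
\log\!\bigl(\epsilon + \|\Psi_d x\|_1\bigr)
  \;\le\; \log\!\bigl(\epsilon + \|\Psi_d x^{(t)}\|_1\bigr)
  + \frac{\|\Psi_d x\|_1 - \|\Psi_d x^{(t)}\|_1}{\epsilon + \|\Psi_d x^{(t)}\|_1}.
% Dropping x-independent terms, the majorized problem becomes
\arg\min_x \; \gamma \|y - \Phi x\|_2^2
  + \sum_{d=1}^D
    \underbrace{\frac{L_d}{\epsilon + \|\Psi_d x^{(t)}\|_1}}_{\lambda_d^{(t+1)}\ (C_d = 1)}
    \,\|\Psi_d x\|_1 .
```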
Approximate-ℓ0 Interpretation of the Log-Sum Penalty

[Figure: the normalized scalar penalty log(ǫ + |u|)/log(1/ǫ) for ǫ ∈ {1e-13, 0.001, 0.1}, compared against the ℓ1 and ℓ0 penalties.]

  (1/log(1/ǫ)) Σ_{n=1}^N log(ǫ + |x_n|)
    = (1/log(1/ǫ)) [ Σ_{n: x_n = 0} log(ǫ) + Σ_{n: x_n ≠ 0} log(ǫ + |x_n|) ]
    = ‖x‖_0 − N + (1/log(1/ǫ)) Σ_{n: x_n ≠ 0} log(ǫ + |x_n|).

As ǫ → 0, the log-sum penalty becomes a scaled and shifted version of the ℓ0 penalty.
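A quick numeric check of this limit (an illustrative snippet, not from the slides): for a sparse test vector, the normalized log-sum penalty approaches ‖x‖_0 − N as ǫ shrinks.

```python
import numpy as np

x = np.array([0.0, 3.0, 0.0, 0.0, -0.5, 0.0])   # ||x||_0 = 2, N = 6
for eps in [1e-1, 1e-3, 1e-13]:
    normalized = np.sum(np.log(eps + np.abs(x))) / np.log(1.0 / eps)
    print(f"eps = {eps:g}:  normalized log-sum penalty = {normalized:+.3f}")
print("limit ||x||_0 - N =", np.count_nonzero(x) - x.size)   # prints -4
```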
Bayesian Interpretations of Co-L1

Co-L1 is an MM approach to Bayesian MAP estimation under an AWGN likelihood and the hierarchical prior

  p(x | λ) = Π_{d=1}^D (λ_d / 2)^{L_d} exp(−λ_d ‖Ψ_d x‖_1)   (i.i.d. Laplacian)
  p(λ) = Π_{d=1}^D Γ(0, 1/ǫ)   (i.i.d. Gamma; i.i.d. Jeffrey's as ǫ → 0),

and, as ǫ → 0, Co-L1 is a variational EM approach to estimating a (deterministic) λ under an AWGN likelihood and the prior

  p(x; λ) = Π_{d=1}^D (λ_d / 2)^{L_d} exp(−λ_d (‖Ψ_d x‖_1 + ǫ))   (i.i.d. Laplacian as ǫ → 0).
Co-IRW-L1 and its Interpretations
A Simplified Version of Co-IRW-L1

Consider the real-valued, fixed-ǫ_d variant of Co-IRW-L1.

1: input: {Ψ_d}_{d=1}^D, Φ, y, γ > 0, ǫ_d > 0 ∀d
2: initialization: λ_d^{(1)} = 1 ∀d, W_d^{(1)} = I ∀d
3: for t = 1, 2, 3, ...
4:   x^{(t)} ← arg min_x  γ ‖y − Φx‖_2² + Σ_{d=1}^D λ_d^{(t)} ‖W_d^{(t)} Ψ_d x‖_1
5:   λ_d^{(t+1)} ← [ (1/L_d) Σ_{l=1}^{L_d} log(1 + |ψ_{d,l}^T x^{(t)}| / ǫ_d) ]^{−1} + 1,   d = 1, ..., D
6:   W_d^{(t+1)} ← diag( 1/(ǫ_d + |ψ_{d,1}^T x^{(t)}|), ..., 1/(ǫ_d + |ψ_{d,L_d}^T x^{(t)}|) ),   d = 1, ..., D
7: end
8: output: x^{(t)}
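A minimal sketch of this simplified loop, in the same style as the Co-L1 sketch above and again reusing genlasso_admm as the inner solver by absorbing λ_d W_d into one stacked operator; the iteration count and the small floor in the λ_d update are illustrative safeguards, not part of the algorithm.

```python
import numpy as np

def co_irw_l1_simple(y, Phi, Psis, gamma, eps_d, outer_iters=10):
    """Sketch of the real-valued, fixed-eps_d Co-IRW-L1 variant.
    eps_d is a sequence with one epsilon per dictionary."""
    lams = np.ones(len(Psis))
    Ws = [np.ones(Psi.shape[0]) for Psi in Psis]   # diagonals of W_d, as vectors
    x = np.zeros(Phi.shape[1])
    for _ in range(outer_iters):
        # Line 4: min_x gamma*||y - Phi x||^2 + sum_d lam_d*||W_d Psi_d x||_1,
        # solved by stacking lam_d * diag(W_d) * Psi_d into a single operator.
        Psi_w = np.vstack([lam * (w[:, None] * Psi)
                           for lam, w, Psi in zip(lams, Ws, Psis)])
        x = genlasso_admm(y, Phi, Psi_w, gamma)
        # Lines 5-6: per-dictionary scalar lam_d and per-atom diagonal W_d.
        for d, Psi in enumerate(Psis):
            r = np.abs(Psi @ x)                                   # |psi_{d,l}^T x|
            lams[d] = 1.0 / max(np.mean(np.log1p(r / eps_d[d])), 1e-12) + 1.0
            Ws[d] = 1.0 / (eps_d[d] + r)
    return x, lams, Ws
```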
Optimization Interpretations of real-Co-IRW-L1-ǫ

Real-Co-IRW-L1-ǫ is an MM approach to the non-convex optimization problem

  arg min_x  γ ‖y − Φx‖_2² + Σ_{d=1}^D Σ_{l=1}^{L_d} log[ (ǫ_d + |ψ_{d,l}^T x|) · Σ_{i=1}^{L_d} log(1 + |ψ_{d,i}^T x| / ǫ_d) ],

and, as ǫ_d → 0, real-Co-IRW-L1-ǫ aims to solve the ℓ0 + weighted ℓ_{0,0} problem

  arg min_x  γ ‖y − Φx‖_2² + ‖Ψx‖_0 + Σ_{d=1}^D L_d 1{‖Ψ_d x‖_0 > 0}.

Note: L_d is the size of dictionary Ψ_d, and 1{·} is the indicator function.
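As with Co-L1, a brief sketch of the MM step (not spelled out on the slide) shows how this penalty reproduces lines 5 and 6 of the simplified algorithm: with u_{d,l} := |ψ_{d,l}^T x|, every term is concave in the u_{d,l} ≥ 0, so a tangent majorization at x^{(t)} gives

```latex
\sum_{l=1}^{L_d} \log\!\bigl(\epsilon_d + u_{d,l}\bigr)
  + L_d \log\!\Bigl(\textstyle\sum_{i=1}^{L_d}\log\bigl(1 + u_{d,i}/\epsilon_d\bigr)\Bigr)
\;\le\; \text{const} \;+\;
\underbrace{\Bigl(1 + \tfrac{L_d}{\sum_{i}\log(1 + u^{(t)}_{d,i}/\epsilon_d)}\Bigr)}_{\lambda_d^{(t+1)}}
\sum_{l=1}^{L_d}
\underbrace{\frac{1}{\epsilon_d + u^{(t)}_{d,l}}}_{[W_d^{(t+1)}]_{l,l}} \, u_{d,l},
```

so minimizing the majorized objective is exactly the weighted-ℓ1 subproblem in line 4.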