LIRMM, Université de Montpellier. Seminar of January 18, 2018.

On modeling image patch distribution for image restoration

Charles Deledalle

Joint work with: Shibin Parameswaran (UCSD/SPAWAR), Loïc Denis (Télécom Saint-Étienne), Truong Nguyen (UCSD).

Institut de Mathématiques de Bordeaux, CNRS-Université de Bordeaux, France.
Department of Electrical and Computer Engineering, University of California, San Diego (UCSD), USA.
Introduction

In many scenarios, one cannot capture a perfectly clean picture of a scene:
• Camera shake
• Motion
• Objects out of focus
• Low-light conditions

In many applications, images are noisy, blurry, sub-sampled, compressed, etc.:
• Microscopy
• Astronomy
• Remote sensing
• Medical imaging
• Sonar

Automatic image restoration algorithms are needed. Fast computation is required to process large image datasets.
Introduction – Inverse problems

Model: $y = A x + w$
• $y \in \mathbb{R}^M$: observed degraded image (with $M$ pixels)
• $x \in \mathbb{R}^N$: unknown underlying "clean" image (with $N$ pixels)
• $w \sim \mathcal{N}(0, \sigma^2 \mathrm{Id}_M)$: noise component (standard deviation $\sigma$)
• $A \colon \mathbb{R}^N \to \mathbb{R}^M$: linear operator (blur, missing pixels, random projections)

Example (deconvolution subject to noise): $y = a \star x + w$, where $A$ is the convolution with a blur kernel $a$.

[Figure: blurry observation $y$ = blur $A$ applied to $x$, plus noise $w$]

Goal: retrieve the sharp and clean image $x$ from $y$.
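As a concrete illustration, here is a minimal sketch of this degradation model in Python/NumPy; the random image, the 9x9 uniform kernel and the noise level are placeholder assumptions standing in for real data:

```python
import numpy as np
from scipy.signal import fftconvolve

# Hypothetical setup: x stands in for a clean grayscale image and
# a for a blur kernel; both would come from real data.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 255, size=(128, 128))   # placeholder "clean" image x
a = np.ones((9, 9)) / 81                   # uniform blur kernel a
sigma = 5.0                                # noise standard deviation

Ax = fftconvolve(x, a, mode="same")        # A x: convolution with a
w = sigma * rng.standard_normal(Ax.shape)  # w ~ N(0, sigma^2 Id_M)
y = Ax + w                                 # observed degraded image
```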
Introduction – Inverse problems

Linear least-square estimator:
$$\hat{x} \in \operatorname*{argmin}_x \; \frac{1}{2\sigma^2} \|A x - y\|_2^2$$

One solution is the Moore-Penrose pseudo-inverse:
$$\hat{x} = A^+ y = \lim_{\varepsilon \to 0} (A^t A + \varepsilon\, \mathrm{Id}_N)^{-1} A^t y$$

Example (Deconvolution)
• $A = F^{-1} \Phi F$: circulant matrix
• $F$: Fourier transform
• $\Phi = \operatorname{diag}(\phi_1, \ldots, \phi_N)$: blur Fourier coefficients

Linear least-square solution:
$$\hat{x} = F^{-1} \hat{c} \quad \text{with} \quad c = F y \quad \text{and} \quad \hat{c}_i = \begin{cases} \dfrac{\phi_i^*\, c_i}{|\phi_i|^2} & \text{if } |\phi_i| > 0 \\[4pt] 0 & \text{otherwise} \end{cases}$$
Introduction – Inverse problems

Example (Deconvolution, continued)

[Figure: (a) observation $y$, (b) $c = F y$, (c) $\hat{c}$, (d) $\hat{x} = F^{-1} \hat{c}$]
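A minimal sketch of this pseudo-inversion in the Fourier domain, assuming periodic boundary conditions; the tolerance `eps` is a numerical stand-in for the $|\phi_i| > 0$ test. As the figure illustrates, this estimator strongly amplifies noise at frequencies where $|\phi_i|$ is small:

```python
import numpy as np

def pseudo_inverse_deconv(y, a, eps=1e-6):
    """Moore-Penrose deconvolution in the Fourier domain.

    Assumes periodic boundaries so that A = F^{-1} Phi F.
    """
    phi = np.fft.fft2(a, s=y.shape)          # blur Fourier coefficients phi_i
    c = np.fft.fft2(y)                       # c = F y
    c_hat = np.zeros_like(c)
    mask = np.abs(phi) > eps                 # keep only |phi_i| > 0
    c_hat[mask] = np.conj(phi[mask]) / np.abs(phi[mask])**2 * c[mask]
    return np.real(np.fft.ifft2(c_hat))      # x_hat = F^{-1} c_hat
```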
Motivations – Variational models

Variational model: regularized linear least-square
$$\hat{x} \in \operatorname*{argmin}_x \; \frac{1}{2\sigma^2} \|A x - y\|_2^2 + R(x)$$

Example (Maximum A Posteriori (MAP))
• $\dfrac{1}{2\sigma^2} \|A x - y\|_2^2 = -\log p(y \mid x)$ (likelihood for Gaussian noise)
• $R(x) = -\log p(x)$ (a priori)

What prior? What about a Gaussian prior?
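To see why the data term is exactly the negative log-likelihood, expand the Gaussian density of $w$ (a one-line check):

```latex
p(y \mid x) = (2\pi\sigma^2)^{-M/2}
              \exp\!\left(-\frac{1}{2\sigma^2}\,\|A x - y\|_2^2\right)
\quad\Longrightarrow\quad
-\log p(y \mid x) = \frac{1}{2\sigma^2}\,\|A x - y\|_2^2
                    + \frac{M}{2}\log(2\pi\sigma^2).
```

The additive constant does not depend on $x$, so it does not affect the minimizer.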
Motivations – Variational models

Example (Wiener deconvolution / Tikhonov regularization)
$$R(x) = \Big\| \underbrace{\Lambda^{-1/2} F}_{\Gamma}\, x \Big\|_2^2 = \sum_i \left(\frac{c_i}{\lambda_i}\right)^2 \quad \text{with} \quad c = F x$$
• $\Lambda = \operatorname{diag}(\lambda_1^2, \ldots, \lambda_N^2)$: mean power spectral density ($\lambda_i \approx \beta\, |\omega_{i,j}|^{-\alpha}$)

Solution is linear:
$$\hat{x} = (A^t A + \sigma^2 \Gamma^t \Gamma)^{-1} A^t y, \quad \text{i.e., in Fourier:} \quad \hat{c}_i = \frac{\phi_i^*}{|\phi_i|^2 + \sigma^2 / \lambda_i^2}\, c_i$$

[Figure: (a) $y$, (b) $c = F y$, (c) filter $\phi_i^* / (|\phi_i|^2 + \sigma^2/\lambda_i^2)$, (d) $\hat{c}$, (e) $\hat{x}$]
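A minimal sketch of this Wiener filter; the power-law spectral model $\lambda_i \approx \beta\, |\omega|^{-\alpha}$ follows the slide, but the default values of `alpha` and `beta` are placeholder assumptions to be fit to data:

```python
import numpy as np

def wiener_deconv(y, a, sigma, alpha=1.0, beta=1.0):
    """Wiener/Tikhonov deconvolution in the Fourier domain."""
    phi = np.fft.fft2(a, s=y.shape)                  # blur coefficients phi_i
    c = np.fft.fft2(y)                               # c = F y
    fx = np.fft.fftfreq(y.shape[0])[:, None]
    fy = np.fft.fftfreq(y.shape[1])[None, :]
    omega = np.maximum(np.hypot(fx, fy), 1e-8)       # radial frequency, no /0
    lam2 = (beta * omega**(-alpha))**2               # lambda_i^2
    c_hat = np.conj(phi) / (np.abs(phi)**2 + sigma**2 / lam2) * c
    return np.real(np.fft.ifft2(c_hat))              # x_hat = F^{-1} c_hat
```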
Motivations – Variational models

Example (Wavelet shrinkage / thresholding)
$$R(x) = \|\Lambda^{-1/2} W x\|_1 = \sum_i \frac{|c_i|}{\lambda_i} \quad \text{with} \quad c = W x$$
• $W$: wavelet transform or frame ($W^+ W = \mathrm{Id}_N$)
• $\Lambda = \operatorname{diag}(\lambda_1^2, \ldots, \lambda_N^2)$: energy for each sub-band ($\lambda_i \approx C\, 2^{j_i}$)

Solution is non-linear, sparse and non-explicit (requires an iterative solver).

[Figure: (a) $y$, (b) $\hat{x}$]
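In the pure denoising case ($A = \mathrm{Id}$, $W$ orthogonal) the problem decouples per coefficient and the solution is explicit: soft-thresholding of each wavelet coefficient. A sketch using PyWavelets (assumed installed); for simplicity a single universal threshold replaces the per-sub-band weights $\lambda_i$ of the slide:

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def wavelet_denoise(y, sigma, wavelet="db4", level=3):
    """Denoising-only case: soft-threshold the detail coefficients."""
    coeffs = pywt.wavedec2(y, wavelet, level=level)       # c = W y
    t = sigma * np.sqrt(2 * np.log(y.size))               # universal threshold
    out = [coeffs[0]]                                     # keep approximation
    for details in coeffs[1:]:
        out.append(tuple(pywt.threshold(d, t, mode="soft") for d in details))
    return pywt.waverec2(out, wavelet)                    # x_hat = W^+ c_hat
```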
Motivations – Variational models

Example (Total-Variation, Rudin et al., 1992)
$$R(x) = \frac{1}{\lambda} \|\nabla x\|_{1,2} = \frac{1}{\lambda} \sum_{i,j} \sqrt{|x_{i+1,j} - x_{i,j}|^2 + |x_{i,j+1} - x_{i,j}|^2}$$
• $\nabla$: gradient, horizontal and vertical forward finite differences
• $\lambda > 0$: regularization parameter (difficult to tune)

Solution is again non-linear and non-explicit (requires an iterative solver).

[Figure: deblurring results, (a) blurry, (b) tiny $\lambda$, (c) small $\lambda$, (d) medium $\lambda$, (e) huge $\lambda$]
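For reference, a minimal sketch evaluating the isotropic TV energy (up to the $1/\lambda$ factor); minimizing it still requires an iterative solver, as noted above:

```python
import numpy as np

def total_variation(x):
    """Isotropic TV with forward finite differences; the last row and
    column get a zero difference (replicated boundary)."""
    dh = np.diff(x, axis=0, append=x[-1:, :])   # x_{i+1,j} - x_{i,j}
    dv = np.diff(x, axis=1, append=x[:, -1:])   # x_{i,j+1} - x_{i,j}
    return np.sum(np.sqrt(dh**2 + dv**2))
```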
Motivations – Patch priors

• Modeling the distribution of images is difficult.
• Learning this distribution is difficult too (curse of dimensionality).
• Images lie on a complex, high-dimensional manifold.
• Their distribution may be spread out over different clusters.

Divide-and-conquer approach: break down images into small patches and model their distribution (see the patch-extraction sketch after this slide):
$$\hat{x} \in \operatorname*{argmin}_x \; \frac{1}{2\sigma^2} \|A x - y\|_2^2 + \sum_{i=1}^{N_P} R(P_i x)$$

All reconstructed overlapping patches must be well explained by the prior.
• $P_i \colon \mathbb{R}^N \to \mathbb{R}^P$: linear operator extracting the patch with $P$ pixels centered at location $i$. Typically $P = 8 \times 8$.
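A minimal sketch of the patch-extraction operators $P_i$, stacking all overlapping patches into rows; boundary handling (here, simple cropping) is an assumption that varies across implementations:

```python
import numpy as np

def extract_patches(x, p=8):
    """Stack all overlapping p-by-p patches of x, one row per location i
    (the P_i x of the slides); boundary patches are simply cropped out."""
    # sliding_window_view returns a (H-p+1, W-p+1, p, p) view without copying.
    patches = np.lib.stride_tricks.sliding_window_view(x, (p, p))
    return patches.reshape(-1, p * p)   # (N_P, P) with P = p*p
```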
Motivations – Patch priors

Example (Fields of Experts, Roth et al., 2005)
$$R(z) = \sum_{k=1}^{K} \alpha_k \log\left(1 + \frac{1}{2} \langle \phi_k, z \rangle^2\right), \quad \alpha_k > 0, \quad \phi_k \in \mathbb{R}^P \text{ a high-pass filter}$$
• $K$ Student-t experts parametrized by $\alpha_k$ and $\phi_k$.
• Learned by maximum likelihood with MCMC.
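A minimal sketch evaluating this energy for one patch, assuming the filters and weights have already been learned:

```python
import numpy as np

def foe_prior(z, filters, alphas):
    """Fields-of-Experts energy for one flattened patch z (length P).
    filters: (K, P) high-pass filters phi_k; alphas: (K,) positive weights."""
    responses = filters @ z                          # <phi_k, z> for all k
    return np.sum(alphas * np.log1p(0.5 * responses**2))
```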
Motivations – Patch priors

Example (Analysis K-SVD, Rubinstein et al., 2013)
$$R(z) = \frac{1}{\lambda} \|\Gamma z\|_0 = \frac{1}{\lambda}\, \#\{i : c_i \neq 0\} \quad \text{with} \quad c = \Gamma z$$
• $\|\cdot\|_0$: $\ell_0$ pseudo-norm promoting sparsity.
• $\Gamma \in \mathbb{R}^{Q \times P}$ learned from a large collection of clean patches.
• Patches distributed on a union of sub-spaces (clusters).
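A tiny sketch of this prior, assuming a pre-learned analysis operator `Gamma`; in floating point, "nonzero" must be approximated by a tolerance:

```python
import numpy as np

def analysis_ksvd_prior(z, Gamma, lam, tol=1e-8):
    """R(z) = (1/lambda) ||Gamma z||_0, counting analysis coefficients
    c_i = (Gamma z)_i with |c_i| above a small tolerance."""
    c = Gamma @ z
    return np.count_nonzero(np.abs(c) > tol) / lam
```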
Motivations – Patch priors

Example (Gaussian Mixture Model priors, Yu et al., 2010)
$$R(z) = -\log p(z - \bar{z}) \quad \text{with} \quad \bar{z} = \frac{1}{P} \mathbf{1}_P \mathbf{1}_P^t z$$
$$\text{and} \quad p(z) = \sum_{k=1}^{K} w_k\, \frac{1}{(2\pi)^{P/2} |\Sigma_k|^{1/2}} \exp\left(-\frac{1}{2} z^t \Sigma_k^{-1} z\right)$$
• $K$: number of Gaussians (clusters)
• $w_k$: weights, $\sum_k w_k = 1$ (frequency of each cluster)
• $\Sigma_k$: $P \times P$ covariance matrix (shape of each cluster)
• Zero-mean assumption (contrast invariance)

Least square + GMM patch prior = Expected Patch Log-Likelihood (EPLL)
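A minimal sketch evaluating $R(z)$ for one patch under a given GMM (weights and covariances assumed already learned); `logsumexp` keeps the mixture evaluation numerically stable:

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def gmm_patch_prior(z, weights, covs):
    """R(z) = -log p(z - z_bar) under the zero-mean GMM prior.
    z: flattened patch (P,); weights: (K,); covs: K covariance matrices."""
    z0 = z - z.mean()                                # remove the patch mean z_bar
    P = z0.size
    log_terms = [np.log(w) + multivariate_normal.logpdf(z0, np.zeros(P), C)
                 for w, C in zip(weights, covs)]
    return -logsumexp(log_terms)                     # -log sum_k w_k N(z0; 0, Sigma_k)
```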
Motivations – Patch priors

Example (EPLL, Zoran & Weiss, 2011)
• $(w_k, \Sigma_k)$ learned by EM on 2 million patches.
• Patch size: $P = 8 \times 8$
• Number of Gaussians: $K = 200$

[Figure: 100 randomly generated patches from the learned model]
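Generating such patches is straightforward once the model is learned: pick a component $k$ with probability $w_k$, then draw $z \sim \mathcal{N}(0, \Sigma_k)$. A sketch, assuming `weights` and `covs` come from the learned model:

```python
import numpy as np

def sample_patches(weights, covs, n=100, rng=None):
    """Draw n patches from the zero-mean GMM; reshaping each draw to 8x8
    reproduces a mosaic like the one on the slide."""
    rng = rng or np.random.default_rng()
    P = covs[0].shape[0]
    ks = rng.choice(len(weights), size=n, p=weights)  # component per sample
    return np.stack([rng.multivariate_normal(np.zeros(P), covs[k]) for k in ks])
```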
Motivations – Patch priors

Example (EPLL, Zoran & Weiss, 2011): denoising, noise with standard deviation $\sigma = 20$ (images in range $[0, 255]$).

[Figure: (a) reference $x$; (b) noisy image $y$, 22.1 / 0.368 (PSNR/SSIM); (c) EPLL result $\hat{x}$, 30.2 / 0.862]
Motivations – Patch priors

Example (EPLL, Zoran & Weiss, 2011): motion-blur deconvolution subject to noise with standard deviation $\sigma = 0.5$.

[Figure: (a) reference $x$ and blur kernel; (b) blurry image $y$, 24.9 / 0.624 (PSNR/SSIM); (c) EPLL result $\hat{x}$, 32.7 / 0.924]
Motivations – Patch priors

Example (EPLL, Zoran & Weiss, 2011)

Pros:
• Near state-of-the-art results in denoising, super-resolution, in-painting, etc.
• No regularization parameter to tune per image-degradation pair.
• Only parameters: the patch size $P$ and the number of components $K$.
• Multi-scale adaptation is straightforward (Papyan & Elad, 2016).

Cons (and how this talk addresses them):
• Non-convex optimization problem → EPLL algorithm (Part 1)
• Original solver is very slow → Fast EPLL (Part 2)
• Some Gibbs artifacts/oscillations can be observed → GGMMs (Part 3)