On the use of Gaussian models on patches for image denoising


  1. On the use of Gaussian models on patches for image denoising. Antoine Houdard, Young Researchers in Imaging Seminars, Institut Henri Poincaré, Wednesday, February 27th.

  2. Digital photography: noise in images. Different ISO settings with constant exposure: 25600 ISO.

  3. Digital photography: noise in images. Different ISO settings with constant exposure: 200 ISO.

  4. Noise modeling and denoising problem.

  5. Patch-based image denoising. Many denoising methods rely on the description of the image by patches:
     - NL-means, Buades, Coll, Morel (2005);
     - BM3D, Dabov, Foi, Katkovnik (2007);
     - PLE, Yu, Sapiro, Mallat (2012);
     - NL-Bayes, Lebrun, Buades, Morel (2012);
     - LDMM, Shi, Osher, Zhu (2017);
     - and many others.

  6. Patch-based image denoising. Hypothesis: the noise patches $N_i$ are i.i.d.

  7. Patch-based image denoising: the Bayesian paradigm.
     - We consider each clean patch $x$ as a realization of a random vector $X$ with prior distribution $P_X$.
     - The Gaussian white noise model rewrites $Y = X + N$ with $N \sim \mathcal{N}(0, \sigma^2 I_p)$; Bayes' theorem then yields the posterior distribution:
       $$P_{X|Y}(x \mid y) = \frac{P_{Y|X}(y \mid x)\, P_X(x)}{P_Y(y)}.$$

  8. Patch-based image denoising: denoising strategies.
     - $\hat{x} = \mathbb{E}[X \mid Y = y]$, the minimum mean square error (MMSE) estimator;
     - $\hat{x} = Dy + \alpha$ such that $D$ and $\alpha$ minimize $\mathbb{E}[\| DY + \alpha - X \|^2]$, the linear MMSE, also called the Wiener estimator;
     - $\hat{x} = \arg\max_{x \in \mathbb{R}^p} p(x \mid y)$, the maximum a posteriori (MAP) estimator.

  9. Outline. 1. Gaussian priors for $X$: why are they widely used? 2. How to infer parameters in high dimension? 3. Presentation of the HDMI method. 4. Limitations of model-based patch-based approaches.

  10. 1. Modeling the clean patches $X_i$.

  11. Choice of the model. In the literature:
     - local Gaussian models: patch-based PCA, Deledalle, Salmon, Dalalyan (2011); NL-Bayes, Lebrun, Buades, Morel (2012); ...
     - Gaussian mixture models: EPLL, Zoran, Weiss (2011); PLE, Yu, Sapiro, Mallat (2012); single-frame image denoising, Teodoro, Almeida, Figueiredo (2015); ...
     Why are Gaussian models so widely used?

  12. Gaussian is convenient.
     - Gaussian model: if $X \sim \mathcal{N}(\mu, \Sigma)$ then
       $$\hat{x}_{\mathrm{MMSE}} = \hat{x}_{\mathrm{Wiener}} = \hat{x}_{\mathrm{MAP}} = \mu + \Sigma (\Sigma + \sigma^2 I)^{-1} (y - \mu).$$
     - Gaussian mixture model (GMM): if $X \sim \sum_{k=1}^K \pi_k \mathcal{N}(\mu_k, \Sigma_k)$ then
       $$\hat{x}_{\mathrm{MMSE}} = \sum_{k=1}^K \mathbb{P}(Z = k \mid Y = y) \left[ \mu_k + \Sigma_k (\Sigma_k + \sigma^2 I)^{-1} (y - \mu_k) \right].$$
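     These closed forms translate directly into code. Below is a minimal sketch of both estimators; the function names and the use of NumPy/SciPy are my own choices, not part of the talk.

        import numpy as np
        from scipy.stats import multivariate_normal

        def gaussian_mmse(y, mu, Sigma, sigma2):
            """Closed-form MMSE/Wiener/MAP estimate under X ~ N(mu, Sigma), Y = X + N."""
            p = len(mu)
            # Solve (Sigma + sigma^2 I) z = y - mu rather than forming the inverse.
            z = np.linalg.solve(Sigma + sigma2 * np.eye(p), y - mu)
            return mu + Sigma @ z

        def gmm_mmse(y, pis, mus, Sigmas, sigma2):
            """MMSE under a GMM prior: responsibilities times per-group Wiener estimates."""
            p = len(y)
            # P(Z = k | Y = y) is computed from the induced noisy model N(mu_k, Sigma_k + sigma^2 I).
            w = np.array([pi * multivariate_normal.pdf(y, m, S + sigma2 * np.eye(p))
                          for pi, m, S in zip(pis, mus, Sigmas)])
            w /= w.sum()
            return sum(wk * gaussian_mmse(y, m, S, sigma2)
                       for wk, m, S in zip(w, mus, Sigmas))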

  13. What do Gaussian models encode? The covariance matrix in Gaussian models and GMMs encodes geometric structures up to some contrast change. [Figure: $s \times s$ patches generated from $\mathcal{N}(m, \Sigma)$, together with the covariance matrix $\Sigma$.]
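     To see what a structured covariance produces, one can sample patches from a multivariate Gaussian. This toy sketch generates patches whose pixels co-vary strongly along columns, yielding vertical structures; the kernel and its length scales are my own illustrative choices, not the ones from the slides.

        import numpy as np

        rng = np.random.default_rng(0)
        s = 10                                    # patch side, so dimension p = s * s
        coords = np.stack(np.meshgrid(np.arange(s), np.arange(s), indexing="ij"),
                          axis=-1).reshape(-1, 2)
        d_row = np.abs(coords[:, None, 0] - coords[None, :, 0])
        d_col = np.abs(coords[:, None, 1] - coords[None, :, 1])
        # Strong correlation along columns, weak decay along rows -> vertical structures.
        Sigma = np.exp(-d_col / 1.0 - d_row / 30.0)
        patches = rng.multivariate_normal(np.zeros(s * s), Sigma, size=16).reshape(16, s, s)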


  15. What do Gaussian models encode? A covariance matrix cannot encode multiple translated versions of a structure. [Figure: a set of 10,000 patches representing edges with random grey levels and random translations.]

  16. What do Gaussian models encode? A covariance matrix cannot encode multiple translated versions of a structure. [Figure: $s \times s$ patches generated from $\mathcal{N}(m, \Sigma)$, together with the covariance matrix $\Sigma$.]

  17. Restore with the right model. [Figure: model covariance matrix, clean patch, noisy patch, denoised patch.]

  18. Conclusion. Modeling the patches with Gaussian models is a good idea:
     - they are convenient for computing the estimates;
     - they are able to encode the geometric structures of the patches.
     But the model needs good parameters!

  19. 2. How to infer parameters in high dimension?

  20. Parameter inference, Gaussian model case: $X \sim \mathcal{N}(\mu_X, \Sigma_X)$, with observed data $\{y_1, \ldots, y_n\}$ sampled from $Y = X + N \sim \mathcal{N}(\mu_Y, \Sigma_Y)$. The maximization of the log-likelihood
     $$L(y; \theta) = -\frac{1}{2} \sum_{i=1}^n (y_i - \mu_Y)^T \Sigma_Y^{-1} (y_i - \mu_Y) - \frac{n}{2} \log \det (2\pi \Sigma_Y)$$
     yields the maximum likelihood estimators (MLE)
     $$\hat{\mu}_Y = \frac{1}{n} \sum_{i=1}^n y_i, \qquad \hat{\Sigma}_Y = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{\mu}_Y)(y_i - \hat{\mu}_Y)^T.$$
     Since $\Sigma_Y = \Sigma_X + \sigma^2 I_p$, this yields $\hat{\mu}_X = \hat{\mu}_Y$ and $\hat{\Sigma}_X = \hat{\Sigma}_Y - \sigma^2 I_p$.
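     These estimators are two lines of NumPy. The sketch below (function name mine) also makes the later difficulty visible: the subtraction can leave $\hat{\Sigma}_X$ with negative eigenvalues when $n$ is small relative to $p$.

        import numpy as np

        def mle_gaussian_prior(Y, sigma2):
            """MLE of the clean-patch Gaussian from noisy patches Y of shape (n, p)."""
            mu_Y = Y.mean(axis=0)
            centered = Y - mu_Y
            Sigma_Y = centered.T @ centered / len(Y)
            # Sigma_Y = Sigma_X + sigma^2 I, so subtract the noise covariance.
            # Warning: for small n this difference may not be positive semi-definite.
            Sigma_X = Sigma_Y - sigma2 * np.eye(Y.shape[1])
            return mu_Y, Sigma_X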

  21. How to group patches? We need to group the patches representing the same structure together. For instance with the $\ell^2$ distance; but this is not robust to strong noise. Gaussian mixture models naturally provide a (more robust) grouping!

  22. Parameter inference, Gaussian mixture model case: $X \sim \sum_k \pi_k \mathcal{N}(\mu_k, \Sigma_k)$. This implies a GMM on the noisy patches, $Y \sim \sum_k \pi_k \mathcal{N}(\mu_k, S_k)$ with $S_k = \Sigma_k + \sigma^2 I_p$. EM algorithm: maximize the conditional expectation of the complete log-likelihood
     $$\sum_{k=1}^K \sum_{i=1}^n t_{ik} \log\left( \pi_k\, g(y_i; \theta_k) \right),$$
     where $t_{ik} = \mathbb{P}(Z_i = k \mid y_i, \theta^\star)$ and $\theta^\star$ is a given set of parameters.
     - E-step: estimate the $t_{ik}$ knowing the current parameters.
     - M-step: compute the maximum likelihood estimators of the parameters:
       $$\hat{\pi}_k = \frac{n_k}{n}, \qquad \hat{\mu}_k = \frac{1}{n_k} \sum_i t_{ik}\, y_i, \qquad \hat{S}_k = \frac{1}{n_k} \sum_i t_{ik} (y_i - \hat{\mu}_k)(y_i - \hat{\mu}_k)^T,$$
       with $n_k = \sum_i t_{ik}$.
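     A compact EM loop for this noisy GMM might look as follows. This is a sketch under my own choices (random Dirichlet initialization, a fixed iteration count, a small ridge for numerical stability), not the talk's implementation.

        import numpy as np
        from scipy.stats import multivariate_normal

        def em_noisy_gmm(Y, K, sigma2, n_iter=50, seed=0):
            """EM for a GMM fitted to noisy patches Y of shape (n, p)."""
            rng = np.random.default_rng(seed)
            n, p = Y.shape
            T = rng.dirichlet(np.ones(K), size=n)          # random initial t_ik
            for _ in range(n_iter):
                # M-step: weighted maximum likelihood estimators.
                nk = T.sum(axis=0)
                pis = nk / n
                mus = (T.T @ Y) / nk[:, None]
                Ss = []
                for k in range(K):
                    C = Y - mus[k]
                    Ss.append((T[:, k, None] * C).T @ C / nk[k] + 1e-8 * np.eye(p))
                # E-step: responsibilities t_ik under the current parameters.
                logw = np.column_stack([np.log(pis[k])
                                        + multivariate_normal.logpdf(Y, mus[k], Ss[k])
                                        for k in range(K)])
                logw -= logw.max(axis=1, keepdims=True)    # avoid underflow
                T = np.exp(logw)
                T /= T.sum(axis=1, keepdims=True)
            # Clean-patch covariances follow from S_k = Sigma_k + sigma^2 I.
            Sigmas = [S - sigma2 * np.eye(p) for S in Ss]
            return pis, mus, Sigmas, T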

  23. Sketch of a denoising algorithm. With all these ingredients, we can design a denoising algorithm (the extraction and aggregation steps are sketched below):
     - extract the patches from the image with the $P_i$ operators;
     - learn a GMM for the clean patches $X$ from the observations of $Y$;
     - denoise each patch with the MMSE;
     - aggregate all the denoised patches with the $P_i^T$ operators.
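     A minimal sketch of the first and last steps; treating aggregation as plain averaging of the overlapping denoised patches is my simplification of the $P_i^T$ step, and the function names are mine.

        import numpy as np

        def extract_patches(img, s):
            """All overlapping s x s patches of img, flattened to rows (the P_i operators)."""
            H, W = img.shape
            idx = [(i, j) for i in range(H - s + 1) for j in range(W - s + 1)]
            return np.array([img[i:i + s, j:j + s].ravel() for i, j in idx]), idx

        def aggregate_patches(patches, idx, shape, s):
            """Average the overlapping denoised patches back into an image."""
            out = np.zeros(shape)
            count = np.zeros(shape)
            for patch, (i, j) in zip(patches, idx):
                out[i:i + s, j:j + s] += patch.reshape(s, s)
                count[i:i + s, j:j + s] += 1
            return out / count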

  24. Sketch of a denoising algorithm. With all these ingredients, we can design a denoising algorithm:
     - extract the patches from the image with the $P_i$ operators;
     - learn a GMM for the clean patches $X$ from the observations of $Y$;
     - denoise each patch with the MMSE;
     - aggregate all the denoised patches with the $P_i^T$ operators.
     But...

  25. The curse of dimensionality. Parameter estimation for Gaussian models or GMMs suffers from the curse of dimensionality: the number of samples needed for the estimation of a parameter grows exponentially with the dimension.

  26. The curse of dimensionality in patch space. We consider patches of size $p = 10 \times 10$, hence a high dimension, and the estimation of sample covariance matrices is difficult: they are ill-conditioned, singular...

  27. The curse of dimensionality in patch space. We consider patches of size $p = 10 \times 10$, hence a high dimension, and the estimation of sample covariance matrices is difficult: they are ill-conditioned, singular... In the literature, this issue is generally worked around by
     - using small patches ($3 \times 3$ or $5 \times 5$): NL-Bayes [Lebrun, Buades, Morel];
     - adding $\varepsilon I$ to singular covariance matrices: PLE [Yu, Sapiro, Mallat];
     - fixing a lower dimension for covariance matrices: S-PLE [Wang, Morel].
     But there is no reason to be afraid of this curse!
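     The second workaround amounts to a one-line fix; the default value of eps below is my own illustrative choice.

        import numpy as np

        def regularize(Sigma, eps=1e-6):
            """PLE-style fix: add eps * I so a singular sample covariance becomes invertible."""
            return Sigma + eps * np.eye(Sigma.shape[0])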

  28. The blessing of dimensionality? In high-dimensional spaces, it is easier to separate data. Many patches represent structures that live locally in a low-dimensional space: using this latent lower dimension makes it possible to group the patches in a more robust way. This "blessing" is used in clustering algorithms designed for high dimension: High-Dimensional Data Clustering [Bouveyron, Girard, Schmid, 2007].

  29. The blessing of dimensionality? An illustration in the context of patches: an image made of vertical stripes of width greater than 2 pixels, with random grey levels.
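     Such a stripe image is easy to reproduce; the image size, the stripe-width range, and the grey-level distribution below are my own illustrative choices.

        import numpy as np

        rng = np.random.default_rng(1)
        H = W = 128
        img = np.zeros((H, W))
        col = 0
        while col < W:
            w = int(rng.integers(3, 8))          # stripe width > 2 pixels
            img[:, col:col + w] = rng.random()   # constant random grey level per stripe
            col += w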

  30. The blessing of dimensionality? An illustration in the context of patches. [Figure: two views of the patch cloud.] In the full patch space, we cannot distinguish the three classes.

  31. The blessing of dimensionality? An illustration in the context of patches. [Figure: two views of the first 3 pixel coordinates.] The algorithm is now able to separate these classes!

  32. 3. High-Dimensional Mixture Models for Image Denoising.

  33. HDMI: presentation of the model. We model the clean patches $X$ with:
     - $Z$, a latent random variable indicating group membership;
     - $X$ living in a low-dimensional subspace which is specific to its latent group:
       $$X \mid Z = k \sim \mathcal{N}(\mu_k, U_k \Lambda_k U_k^T),$$
       where $U_k$ is a $p \times d_k$ orthogonal matrix and $\Lambda_k = \mathrm{diag}(\lambda_{k1}, \ldots, \lambda_{k d_k})$ is a diagonal matrix of size $d_k \times d_k$.

  34. HDMI: induced model on the noisy patches $Y$. The model on $X$ implies that $Y$ follows a full-rank GMM,
     $$p(y) = \sum_{k=1}^K \pi_k\, g(y; \mu_k, \Sigma_k),$$
     where $\Sigma_k$ has a specific structure: completing $U_k$ into an orthonormal basis $Q_k$ of $\mathbb{R}^p$,
     $$Q_k^T \Sigma_k Q_k = \mathrm{diag}(\underbrace{a_{k1}, \ldots, a_{k d_k}}_{d_k}, \underbrace{\sigma^2, \ldots, \sigma^2}_{p - d_k}),$$
     where $a_{kj} = \lambda_{kj} + \sigma^2 > \sigma^2$, for $j = 1, \ldots, d_k$.
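     Equivalently, $\Sigma_k = U_k \Lambda_k U_k^T + \sigma^2 I_p$: signal variance along $d_k$ directions plus isotropic noise. A small sketch of this construction (function name mine):

        import numpy as np

        def hdmi_covariance(U, lambdas, sigma2):
            """Sigma_k = U diag(lambdas) U^T + sigma^2 I, for a p x d_k orthogonal U."""
            p = U.shape[0]
            return U @ np.diag(lambdas) @ U.T + sigma2 * np.eye(p)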

  35. Denoising with the HDMI model. The HDMI model being known, each patch is denoised with the MMSE
     $$\hat{x}_i = \mathbb{E}[X \mid Y = y_i] = \sum_{k=1}^K t_{ik}\, \psi_k(y_i),$$
     where $t_{ik}$ is the posterior probability for the patch $y_i$ to belong to the $k$-th group and
     $$\psi_k(y_i) = \mu_k + U_k\, \mathrm{diag}\!\left( \frac{a_{k1} - \sigma^2}{a_{k1}}, \ldots, \frac{a_{k d_k} - \sigma^2}{a_{k d_k}} \right) U_k^T (y_i - \mu_k).$$
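     In code, $\psi_k$ is a Wiener-like shrinkage along the $d_k$ leading directions; a minimal sketch (function name mine):

        import numpy as np

        def hdmi_denoise_group(y, mu, U, a, sigma2):
            """psi_k: shrink y - mu along the columns of U (p x d_k) by (a - sigma^2) / a."""
            shrink = (a - sigma2) / a            # one factor per retained direction
            z = U.T @ (y - mu)                   # coordinates in the group subspace
            return mu + U @ (shrink * z)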

  36. Model inference. With an EM algorithm, the parameters are updated during the M-step (a sketch follows):
     - $\hat{U}_k$ is formed by the $d_k$ first eigenvectors of the sample covariance matrix;
     - $\hat{a}_{kj}$ is the $j$-th eigenvalue of the sample covariance matrix.
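     A per-group M-step sketch, assuming responsibilities t_k for one group; the function name and the responsibility-weighted covariance are my own framing of the update.

        import numpy as np

        def hdmi_m_step_group(Y, t_k, d_k):
            """M-step for one group: eigendecomposition of its weighted sample covariance."""
            mu = (t_k @ Y) / t_k.sum()
            C = Y - mu
            S = (t_k[:, None] * C).T @ C / t_k.sum()
            vals, vecs = np.linalg.eigh(S)        # eigenvalues in ascending order
            order = np.argsort(vals)[::-1]        # sort descending
            U = vecs[:, order[:d_k]]              # d_k leading eigenvectors -> U_k
            a = vals[order[:d_k]]                 # a_kj = j-th eigenvalue of S
            return mu, U, a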
