On the use of Gaussian models on patches for image denoising


  1. On the use of Gaussian models on patches for image denoising. Antoine Houdard, Young Researchers in Imaging Seminars, Institut Henri Poincaré, Wednesday, February 27th.

  2. Digital photography: noise in images. Different ISO settings with constant exposure: 25600 ISO.

  3. Digital photography: noise in images. Different ISO settings with constant exposure: 200 ISO.

  4. Noise modeling and denoising problem.

  5. Patch-based image denoising. Many denoising methods rely on the description of the image by patches:
     - NL-means, Buades, Coll, Morel (2005);
     - BM3D, Dabov, Foi, Katkovnik (2007);
     - PLE, Yu, Sapiro, Mallat (2012);
     - NL-Bayes, Lebrun, Buades, Morel (2012);
     - LDMM, Shi, Osher, Zhu (2017);
     - and many others.

  6. Patch-based image denoising. Hypothesis: the noise patches $N_i$ are i.i.d.

  7. Patch-based image denoising: the Bayesian paradigm.
     - We consider each clean patch $x$ as a realization of a random vector $X$ with prior distribution $P_X$.
     - The Gaussian white noise model rewrites $Y = X + N$ with $N \sim \mathcal{N}(0, \sigma^2 I_p)$; Bayes' theorem then yields the posterior distribution:
       $$P_{X|Y}(x \mid y) = \frac{P_{Y|X}(y \mid x)\, P_X(x)}{P_Y(y)}.$$

  8. Patch-based image denoising: denoising strategies.
     - $\hat{x} = \mathbb{E}[X \mid Y = y]$, the minimum mean square error (MMSE) estimator;
     - $\hat{x} = Dy + \alpha$ such that $D$ and $\alpha$ minimize $\mathbb{E}[\| DY + \alpha - X \|^2]$, the linear MMSE, also called the Wiener estimator;
     - $\hat{x} = \arg\max_{x \in \mathbb{R}^p} p(x \mid y)$, the maximum a posteriori (MAP) estimator.

  9. Outline. 1. Gaussian priors for $X$: why are they widely used? 2. How to infer parameters in high dimension? 3. Presentation of the HDMI method. 4. Limitations of model-based patch-based approaches.

  10. 1. Modeling the clean patches $X_i$.

  11. Choice of the model. In the literature:
     - local Gaussian models: patch-based PCA, Deledalle, Salmon, Dalalyan (2011); NL-Bayes, Lebrun, Buades, Morel (2012); ...
     - Gaussian mixture models: EPLL, Zoran, Weiss (2011); PLE, Yu, Sapiro, Mallat (2012); single-frame image denoising, Teodoro, Almeida, Figueiredo (2015); ...
     Why are Gaussian models so widely used?

  12. Gaussian is convenient.
     - Gaussian model: if $X \sim \mathcal{N}(\mu, \Sigma)$ then
       $$\hat{x}_{\mathrm{MMSE}} = \hat{x}_{\mathrm{Wiener}} = \hat{x}_{\mathrm{MAP}} = \mu + \Sigma (\Sigma + \sigma^2 I)^{-1} (y - \mu).$$
     - Gaussian mixture model (GMM): if $X \sim \sum_{k=1}^K \pi_k \mathcal{N}(\mu_k, \Sigma_k)$ then
       $$\hat{x}_{\mathrm{MMSE}} = \sum_{k=1}^K \mathbb{P}(Z = k \mid Y = y) \left[ \mu_k + \Sigma_k (\Sigma_k + \sigma^2 I)^{-1} (y - \mu_k) \right].$$
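     These closed forms translate directly into code. Below is a minimal sketch of both estimators; the function names and the use of NumPy/SciPy are my own choices, not part of the talk.

        import numpy as np
        from scipy.stats import multivariate_normal

        def gaussian_mmse(y, mu, Sigma, sigma2):
            """Closed-form MMSE/Wiener/MAP estimate under X ~ N(mu, Sigma), Y = X + N."""
            p = len(mu)
            # Solve (Sigma + sigma^2 I) z = y - mu rather than forming the inverse.
            z = np.linalg.solve(Sigma + sigma2 * np.eye(p), y - mu)
            return mu + Sigma @ z

        def gmm_mmse(y, pis, mus, Sigmas, sigma2):
            """MMSE under a GMM prior: responsibilities times per-group Wiener estimates."""
            p = len(y)
            # P(Z = k | Y = y) is computed from the induced noisy model N(mu_k, Sigma_k + sigma^2 I).
            w = np.array([pi * multivariate_normal.pdf(y, m, S + sigma2 * np.eye(p))
                          for pi, m, S in zip(pis, mus, Sigmas)])
            w /= w.sum()
            return sum(wk * gaussian_mmse(y, m, S, sigma2)
                       for wk, m, S in zip(w, mus, Sigmas))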

  13. What do Gaussian models encode? The covariance matrix in Gaussian models and GMMs encodes geometric structures up to some contrast change. [Figure: $s \times s$ patches generated from $\mathcal{N}(m, \Sigma)$, together with the covariance matrix $\Sigma$.]
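     To see what a structured covariance produces, one can sample patches from a multivariate Gaussian. This toy sketch generates patches whose pixels co-vary strongly along columns, yielding vertical structures; the kernel and its length scales are my own illustrative choices, not the ones from the slides.

        import numpy as np

        rng = np.random.default_rng(0)
        s = 10                                    # patch side, so dimension p = s * s
        coords = np.stack(np.meshgrid(np.arange(s), np.arange(s), indexing="ij"),
                          axis=-1).reshape(-1, 2)
        d_row = np.abs(coords[:, None, 0] - coords[None, :, 0])
        d_col = np.abs(coords[:, None, 1] - coords[None, :, 1])
        # Strong correlation along columns, weak decay along rows -> vertical structures.
        Sigma = np.exp(-d_col / 1.0 - d_row / 30.0)
        patches = rng.multivariate_normal(np.zeros(s * s), Sigma, size=16).reshape(16, s, s)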


  15. What do Gaussian models encode? A covariance matrix cannot encode multiple translated versions of a structure. [Figure: a set of 10,000 patches representing edges with random grey levels and random translations.]

  16. What do Gaussian models encode? A covariance matrix cannot encode multiple translated versions of a structure. [Figure: $s \times s$ patches generated from $\mathcal{N}(m, \Sigma)$, together with the covariance matrix $\Sigma$.]

  17. Restore with the right model. [Figure: model covariance matrix, clean patch, noisy patch, denoised patch.]

  18. Conclusion. Modeling the patches with Gaussian models is a good idea:
     - they are convenient for computing the estimates;
     - they are able to encode the geometric structures of the patches.
     But the model needs good parameters!

  19. 2. How to infer parameters in high dimension?

  20. Parameter inference, Gaussian model case: $X \sim \mathcal{N}(\mu_X, \Sigma_X)$, with observed data $\{y_1, \ldots, y_n\}$ sampled from $Y = X + N \sim \mathcal{N}(\mu_Y, \Sigma_Y)$. The maximization of the log-likelihood
     $$L(y; \theta) = -\frac{1}{2} \sum_{i=1}^n (y_i - \mu_Y)^T \Sigma_Y^{-1} (y_i - \mu_Y) - \frac{n}{2} \log \det (2\pi \Sigma_Y)$$
     yields the maximum likelihood estimators (MLE)
     $$\hat{\mu}_Y = \frac{1}{n} \sum_{i=1}^n y_i, \qquad \hat{\Sigma}_Y = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{\mu}_Y)(y_i - \hat{\mu}_Y)^T.$$
     Since $\Sigma_Y = \Sigma_X + \sigma^2 I_p$, this yields $\hat{\mu}_X = \hat{\mu}_Y$ and $\hat{\Sigma}_X = \hat{\Sigma}_Y - \sigma^2 I_p$.
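     These estimators are two lines of NumPy. The sketch below (function name mine) also makes the later difficulty visible: the subtraction can leave $\hat{\Sigma}_X$ with negative eigenvalues when $n$ is small relative to $p$.

        import numpy as np

        def mle_gaussian_prior(Y, sigma2):
            """MLE of the clean-patch Gaussian from noisy patches Y of shape (n, p)."""
            mu_Y = Y.mean(axis=0)
            centered = Y - mu_Y
            Sigma_Y = centered.T @ centered / len(Y)
            # Sigma_Y = Sigma_X + sigma^2 I, so subtract the noise covariance.
            # Warning: for small n this difference may not be positive semi-definite.
            Sigma_X = Sigma_Y - sigma2 * np.eye(Y.shape[1])
            return mu_Y, Sigma_X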

  21. How to group patches? We need to group the patches representing the same structure together. For instance with the $\ell^2$ distance; but this is not robust to strong noise. Gaussian mixture models naturally provide a (more robust) grouping!

  22. Parameter inference, Gaussian mixture model case: $X \sim \sum_k \pi_k \mathcal{N}(\mu_k, \Sigma_k)$. This implies a GMM on the noisy patches, $Y \sim \sum_k \pi_k \mathcal{N}(\mu_k, S_k)$ with $S_k = \Sigma_k + \sigma^2 I_p$. EM algorithm: maximize the conditional expectation of the complete log-likelihood
     $$\sum_{k=1}^K \sum_{i=1}^n t_{ik} \log\left( \pi_k\, g(y_i; \theta_k) \right),$$
     where $t_{ik} = \mathbb{P}(Z_i = k \mid y_i, \theta^\star)$ and $\theta^\star$ is a given set of parameters.
     - E-step: estimate the $t_{ik}$ knowing the current parameters.
     - M-step: compute the maximum likelihood estimators of the parameters:
       $$\hat{\pi}_k = \frac{n_k}{n}, \qquad \hat{\mu}_k = \frac{1}{n_k} \sum_i t_{ik}\, y_i, \qquad \hat{S}_k = \frac{1}{n_k} \sum_i t_{ik} (y_i - \hat{\mu}_k)(y_i - \hat{\mu}_k)^T,$$
       with $n_k = \sum_i t_{ik}$.
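     A compact EM loop for this noisy GMM might look as follows. This is a sketch under my own choices (random Dirichlet initialization, a fixed iteration count, a small ridge for numerical stability), not the talk's implementation.

        import numpy as np
        from scipy.stats import multivariate_normal

        def em_noisy_gmm(Y, K, sigma2, n_iter=50, seed=0):
            """EM for a GMM fitted to noisy patches Y of shape (n, p)."""
            rng = np.random.default_rng(seed)
            n, p = Y.shape
            T = rng.dirichlet(np.ones(K), size=n)          # random initial t_ik
            for _ in range(n_iter):
                # M-step: weighted maximum likelihood estimators.
                nk = T.sum(axis=0)
                pis = nk / n
                mus = (T.T @ Y) / nk[:, None]
                Ss = []
                for k in range(K):
                    C = Y - mus[k]
                    Ss.append((T[:, k, None] * C).T @ C / nk[k] + 1e-8 * np.eye(p))
                # E-step: responsibilities t_ik under the current parameters.
                logw = np.column_stack([np.log(pis[k])
                                        + multivariate_normal.logpdf(Y, mus[k], Ss[k])
                                        for k in range(K)])
                logw -= logw.max(axis=1, keepdims=True)    # avoid underflow
                T = np.exp(logw)
                T /= T.sum(axis=1, keepdims=True)
            # Clean-patch covariances follow from S_k = Sigma_k + sigma^2 I.
            Sigmas = [S - sigma2 * np.eye(p) for S in Ss]
            return pis, mus, Sigmas, T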

  23. Sketch of a denoising algorithm. With all these ingredients, we can design a denoising algorithm (the extraction and aggregation steps are sketched below):
     - extract the patches from the image with the $P_i$ operators;
     - learn a GMM for the clean patches $X$ from the observations of $Y$;
     - denoise each patch with the MMSE;
     - aggregate all the denoised patches with the $P_i^T$ operators.
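     A minimal sketch of the first and last steps; treating aggregation as plain averaging of the overlapping denoised patches is my simplification of the $P_i^T$ step, and the function names are mine.

        import numpy as np

        def extract_patches(img, s):
            """All overlapping s x s patches of img, flattened to rows (the P_i operators)."""
            H, W = img.shape
            idx = [(i, j) for i in range(H - s + 1) for j in range(W - s + 1)]
            return np.array([img[i:i + s, j:j + s].ravel() for i, j in idx]), idx

        def aggregate_patches(patches, idx, shape, s):
            """Average the overlapping denoised patches back into an image."""
            out = np.zeros(shape)
            count = np.zeros(shape)
            for patch, (i, j) in zip(patches, idx):
                out[i:i + s, j:j + s] += patch.reshape(s, s)
                count[i:i + s, j:j + s] += 1
            return out / count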

  24. Sketch of a denoising algorithm. With all these ingredients, we can design a denoising algorithm:
     - extract the patches from the image with the $P_i$ operators;
     - learn a GMM for the clean patches $X$ from the observations of $Y$;
     - denoise each patch with the MMSE;
     - aggregate all the denoised patches with the $P_i^T$ operators.
     But...

  25. The curse of dimensionality. Parameter estimation for Gaussian models or GMMs suffers from the curse of dimensionality: the number of samples needed for the estimation of a parameter grows exponentially with the dimension.

  26. The curse of dimensionality in patch space. We consider patches of size $p = 10 \times 10$, hence a high dimension, and the estimation of sample covariance matrices is difficult: they are ill-conditioned, singular...

  27. The curse of dimensionality in patch space. We consider patches of size $p = 10 \times 10$, hence a high dimension, and the estimation of sample covariance matrices is difficult: they are ill-conditioned, singular... In the literature, this issue is generally worked around by
     - using small patches ($3 \times 3$ or $5 \times 5$): NL-Bayes [Lebrun, Buades, Morel];
     - adding $\varepsilon I$ to singular covariance matrices: PLE [Yu, Sapiro, Mallat];
     - fixing a lower dimension for covariance matrices: S-PLE [Wang, Morel].
     But there is no reason to be afraid of this curse!
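     The second workaround amounts to a one-line fix; the default value of eps below is my own illustrative choice.

        import numpy as np

        def regularize(Sigma, eps=1e-6):
            """PLE-style fix: add eps * I so a singular sample covariance becomes invertible."""
            return Sigma + eps * np.eye(Sigma.shape[0])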

  28. The blessing of dimensionality? In high-dimensional spaces, it is easier to separate data. Many patches represent structures that live locally in a low-dimensional space: using this latent lower dimension makes it possible to group the patches in a more robust way. This "blessing" is used in clustering algorithms designed for high dimension: High-Dimensional Data Clustering [Bouveyron, Girard, Schmid, 2007].

  29. The blessing of dimensionality? An illustration in the context of patches: an image made of vertical stripes of width greater than 2 pixels, with random grey levels.
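     Such a stripe image is easy to reproduce; the image size, the stripe-width range, and the grey-level distribution below are my own illustrative choices.

        import numpy as np

        rng = np.random.default_rng(1)
        H = W = 128
        img = np.zeros((H, W))
        col = 0
        while col < W:
            w = int(rng.integers(3, 8))          # stripe width > 2 pixels
            img[:, col:col + w] = rng.random()   # constant random grey level per stripe
            col += w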

  30. The blessing of dimensionality? An illustration in the context of patches. [Figure: two views of the patch cloud.] In the full patch space, we cannot distinguish the three classes.

  31. The blessing of dimensionality? An illustration in the context of patches. [Figure: two views of the first 3 pixel coordinates.] The algorithm is now able to separate these classes!

  32. 3. High-Dimensional Mixture Models for Image Denoising.

  33. HDMI: presentation of the model. We model the clean patches $X$ with:
     - $Z$, a latent random variable indicating group membership;
     - $X$ living in a low-dimensional subspace which is specific to its latent group:
       $$X \mid Z = k \sim \mathcal{N}(\mu_k, U_k \Lambda_k U_k^T),$$
       where $U_k$ is a $p \times d_k$ orthogonal matrix and $\Lambda_k = \mathrm{diag}(\lambda_{k1}, \ldots, \lambda_{k d_k})$ is a diagonal matrix of size $d_k \times d_k$.

  34. HDMI: induced model on the noisy patches $Y$. The model on $X$ implies that $Y$ follows a full-rank GMM,
     $$p(y) = \sum_{k=1}^K \pi_k\, g(y; \mu_k, \Sigma_k),$$
     where $\Sigma_k$ has a specific structure: completing $U_k$ into an orthonormal basis $Q_k$ of $\mathbb{R}^p$,
     $$Q_k^T \Sigma_k Q_k = \mathrm{diag}(\underbrace{a_{k1}, \ldots, a_{k d_k}}_{d_k}, \underbrace{\sigma^2, \ldots, \sigma^2}_{p - d_k}),$$
     where $a_{kj} = \lambda_{kj} + \sigma^2 > \sigma^2$, for $j = 1, \ldots, d_k$.
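     Equivalently, $\Sigma_k = U_k \Lambda_k U_k^T + \sigma^2 I_p$: signal variance along $d_k$ directions plus isotropic noise. A small sketch of this construction (function name mine):

        import numpy as np

        def hdmi_covariance(U, lambdas, sigma2):
            """Sigma_k = U diag(lambdas) U^T + sigma^2 I, for a p x d_k orthogonal U."""
            p = U.shape[0]
            return U @ np.diag(lambdas) @ U.T + sigma2 * np.eye(p)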

  35. Denoising with the HDMI model. The HDMI model being known, each patch is denoised with the MMSE
     $$\hat{x}_i = \mathbb{E}[X \mid Y = y_i] = \sum_{k=1}^K t_{ik}\, \psi_k(y_i),$$
     where $t_{ik}$ is the posterior probability for the patch $y_i$ to belong to the $k$-th group and
     $$\psi_k(y_i) = \mu_k + U_k\, \mathrm{diag}\!\left( \frac{a_{k1} - \sigma^2}{a_{k1}}, \ldots, \frac{a_{k d_k} - \sigma^2}{a_{k d_k}} \right) U_k^T (y_i - \mu_k).$$
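     In code, $\psi_k$ is a Wiener-like shrinkage along the $d_k$ leading directions; a minimal sketch (function name mine):

        import numpy as np

        def hdmi_denoise_group(y, mu, U, a, sigma2):
            """psi_k: shrink y - mu along the columns of U (p x d_k) by (a - sigma^2) / a."""
            shrink = (a - sigma2) / a            # one factor per retained direction
            z = U.T @ (y - mu)                   # coordinates in the group subspace
            return mu + U @ (shrink * z)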

  36. Model inference. With an EM algorithm, the parameters are updated during the M-step (a sketch follows):
     - $\hat{U}_k$ is formed by the $d_k$ first eigenvectors of the sample covariance matrix;
     - $\hat{a}_{kj}$ is the $j$-th eigenvalue of the sample covariance matrix.
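     A per-group M-step sketch, assuming responsibilities t_k for one group; the function name and the responsibility-weighted covariance are my own framing of the update.

        import numpy as np

        def hdmi_m_step_group(Y, t_k, d_k):
            """M-step for one group: eigendecomposition of its weighted sample covariance."""
            mu = (t_k @ Y) / t_k.sum()
            C = Y - mu
            S = (t_k[:, None] * C).T @ C / t_k.sum()
            vals, vecs = np.linalg.eigh(S)        # eigenvalues in ascending order
            order = np.argsort(vals)[::-1]        # sort descending
            U = vecs[:, order[:d_k]]              # d_k leading eigenvectors -> U_k
            a = vals[order[:d_k]]                 # a_kj = j-th eigenvalue of S
            return mu, U, a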
