How to use Gaussian mixture models on patches for solving image inverse problems
Workshop MixStatSeq
Antoine Houdard (LTCI, Télécom ParisTech and MAP5, Université Paris Descartes)
antoine.houdard@telecom-paristech.fr | houdard.wp.imt.fr
Joint work with C. Bouveyron & J. Delon
Image restoration: solving an inverse problem
- Image restoration problem: find the clean image $u$ from the observed degraded image $v$ such that $v = \Phi u + \epsilon$, with $\Phi$ a degradation operator and $\epsilon$ an additive noise.
- Gaussian white noise case: here we deal with the simpler problem $\Phi = I$ and $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$.
Patch-based image denoising
- Most of the denoising methods rely on the description of the image by patches (NL-means, NL-Bayes, S-PLE, LDMM, PLE, BM3D, DA3D).
"Patches are to images what phonemes are to speech." (Pattern Theory, Desolneux & Mumford)
Patch-based image denoising: the statistical framework
- We consider each clean patch $x_i$ as a realization of a random vector $X_i$ with some prior distribution $P_X$.
- The Gaussian white noise model for patches yields $Y_i = X_i + \sigma N_i$, with $N_i \sim \mathcal{N}(0, I_p)$.
- Hypothesis: $N_i$ and $X_i$ are independent and the $N_i$'s are i.i.d.
- So we can write the posterior distribution with Bayes' theorem:
$$P_{X \mid Y}(x \mid y) = \frac{P_{Y \mid X}(y \mid x)\, P_X(x)}{P_Y(y)}.$$
Patch-based image denoising: denoising strategies
- $\hat{x} = E[X \mid Y = y]$, the minimum mean square error (MMSE) estimator;
- $\hat{x} = Dy + \alpha$ such that $D$ and $\alpha$ minimize $E[\|DY + \alpha - X\|^2]$, which is the linear MMSE, also called the Wiener estimator;
- $\hat{x} = \arg\max_{x \in \mathbb{R}^p} p(x \mid y)$, the maximum a posteriori (MAP) estimator.
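As an illustration of the linear MMSE strategy (not from the talk), here is a minimal NumPy sketch of the Wiener estimator for a single Gaussian prior $X \sim \mathcal{N}(\mu, \Sigma)$ and noise model $Y = X + \sigma N$; the function name and interface are illustrative.

```python
import numpy as np

def wiener_denoise(y, mu, Sigma, sigma):
    """Linear MMSE (Wiener) estimate of x from y = x + sigma * n,
    under the Gaussian prior X ~ N(mu, Sigma) and n ~ N(0, I)."""
    p = y.shape[0]
    # E[X | Y = y] = mu + Sigma (Sigma + sigma^2 I)^{-1} (y - mu)
    return mu + Sigma @ np.linalg.solve(Sigma + sigma**2 * np.eye(p), y - mu)
```

For a Gaussian prior this closed form is both the MMSE and the linear MMSE estimator, which is what makes Gaussian (and Gaussian mixture) models so convenient in the patch framework.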
Patch-based image denoising: choice and inference of the model
In the literature:
- local Gaussian models [NL-Bayes];
- Gaussian mixture models (GMM) [PLE, S-PLE, EPLL].
Advantages of Gaussian models and GMM:
- able to encode structural information about the patches;
- make the computation of estimators easy.
Patch-based image denoising: Gaussian and GMM models
The covariance matrix in Gaussian models and GMM is able to encode geometric structure in patches.
[Figure. Left: covariance matrix $\Sigma$. Right: patches generated from the Gaussian model $\mathcal{N}(0, \Sigma)$.]
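A minimal sketch of how such patches can be generated: build a structured covariance for $s \times s$ patches and draw samples from $\mathcal{N}(0, \Sigma)$. The particular rank-one covariance below is a hypothetical example, not the one shown on the slides.

```python
import numpy as np

s = 10                                     # patch side, p = s * s
p = s * s
edge = np.zeros((s, s))
edge[:, s // 2:] = 1.0                     # a vertical step pattern
v = (edge - edge.mean()).ravel()
Sigma = np.outer(v, v) + 0.01 * np.eye(p)  # rank-one structure + small isotropic part

rng = np.random.default_rng(0)
patches = rng.multivariate_normal(np.zeros(p), Sigma, size=4)
patches = patches.reshape(4, s, s)         # four generated patches, each s x s
```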
Restore with the right model
[Figure: covariance matrix, clean patch, noisy patch, denoised patch.]
Patch-based image denoising: summary of the framework
The curse of dimensionality
Parameter estimation for Gaussian models or GMMs suffers from the curse of dimensionality.
The term "curse" was first used by R. Bellman in the introduction of his book "Dynamic Programming" (1957):
"All [problems due to high dimension] may be subsumed under the heading 'the curse of dimensionality'. Since this is a curse, [...], there is no need to feel discouraged about the possibility of obtaining significant results despite it."
The curse of dimensionality: high-dimensional spaces are empty
In high-dimensional space no one can hear you scream!
The curse of dimensionality: high-dimensional spaces are empty
Neighborhoods are no longer local! Data points are isolated.
The curse of dimensionality: in patch space
We consider patches of size $p = 10 \times 10$, i.e. a high-dimensional space. As a consequence, the estimation of sample covariance matrices is difficult: they are ill-conditioned or even singular.
In the literature, this issue is worked around by
- the use of small patches in NL-Bayes ($3 \times 3$ or $5 \times 5$);
- a mixture model with fixed low-dimensional covariances in S-PLE.
We propose a fully statistical model that estimates a low intrinsic dimension for each group.
Reminder: noise model and notations
We denote
- $\{y_1, \dots, y_n\} \subset \mathbb{R}^p$ the (observed) noisy patches of the image;
- $\{x_1, \dots, x_n\} \subset \mathbb{R}^p$ the corresponding (unobserved) clean patches.
We suppose they are realizations of random variables $Y$ and $X$ that follow the classical degradation model
$$Y = X + \sigma N, \qquad N \sim \mathcal{N}(0, I_p).$$
We design for $X$ the High-Dimensional Mixture Model for Image Denoising (HDMI).
The HDMI model
- Model on the actual patches $X$. Let $Z$ be the latent random variable indicating the group from which the patch $X$ has been generated. We assume that $X$ lives in a low-dimensional subspace which is specific to its latent group:
$$X \mid Z = k \;=\; U_k T + \mu_k,$$
where $U_k$ is a $p \times d_k$ orthonormal transformation matrix and $T \in \mathbb{R}^{d_k}$ is such that
$$T \mid Z = k \sim \mathcal{N}(0, \Lambda_k), \quad \text{with } \Lambda_k = \mathrm{diag}(\lambda_{k1}, \dots, \lambda_{kd_k}).$$
- Model on the noisy patches. This implies that $Y$ follows the mixture density
$$p(y) = \sum_{k=1}^{K} \pi_k\, g(y; \mu_k, \Sigma_k),$$
where $\pi_k$ is the mixture proportion of the $k$-th component and $\Sigma_k = U_k \Lambda_k U_k^T + \sigma^2 I_p$.
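To make the generative model concrete, here is a hedged NumPy sketch that assembles $\Sigma_k = U_k \Lambda_k U_k^T + \sigma^2 I_p$ and draws noisy patches from the mixture. The function names and the data layout (lists indexed by group) are assumptions, not the authors' code.

```python
import numpy as np

def hdmi_covariance(U_k, lambda_k, sigma):
    """Noisy-patch covariance of group k: Sigma_k = U_k Lambda_k U_k^T + sigma^2 I_p."""
    p = U_k.shape[0]
    return U_k @ np.diag(lambda_k) @ U_k.T + sigma**2 * np.eye(p)

def sample_hdmi(pi, mus, Us, lambdas, sigma, n, seed=0):
    """Draw n noisy patches y_i from the HDMI generative model."""
    rng = np.random.default_rng(seed)
    K, p = len(pi), mus[0].shape[0]
    zs = rng.choice(K, size=n, p=pi)                    # latent groups Z_i
    ys = np.empty((n, p))
    for i, k in enumerate(zs):
        t = rng.normal(size=len(lambdas[k])) * np.sqrt(lambdas[k])  # T | Z=k ~ N(0, Lambda_k)
        x = Us[k] @ t + mus[k]                          # clean patch X in the group subspace
        ys[i] = x + sigma * rng.normal(size=p)          # noisy patch Y = X + sigma * N
    return ys
```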
The HDMI model
The projection $\Delta_k = Q_k \Sigma_k Q_k^t$ of the covariance matrix has the specific structure
$$\Delta_k = \mathrm{diag}(\underbrace{a_{k1}, \dots, a_{kd_k}}_{d_k}, \underbrace{\sigma^2, \dots, \sigma^2}_{p - d_k}),$$
where $a_{kj} = \lambda_{kj} + \sigma^2$ and $a_{kj} > \sigma^2$, for $j = 1, \dots, d_k$.
The HDMI model Q k π σ 2 Z N X a k 1 , ..., a kd k µ k , d k T Y Figure – Graphical representation of the HDMI model. 16 / 29
Denoising with the HDMI model
The HDMI model being known, each patch is denoised with the MMSE estimator $\hat{x}_i = E[X \mid Y = y_i]$, which can be computed as follows.
Proposition.
$$E[X \mid Y = y_i] = \sum_{k=1}^{K} t_{ik}\, \psi_k(y_i),$$
with $t_{ik}$ the posterior probability for the patch $y_i$ to belong to the $k$-th group and
$$\psi_k(y_i) = \mu_k + U_k\, \mathrm{diag}\!\left(\frac{a_{k1} - \sigma^2}{a_{k1}}, \dots, \frac{a_{kd_k} - \sigma^2}{a_{kd_k}}\right) U_k^T (y_i - \mu_k).$$
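A possible implementation of this proposition for a single patch, assuming the parameters are stored as lists indexed by group; this is a sketch, not the reference HDMI implementation. It uses scipy.stats for the Gaussian densities that define the posterior probabilities $t_{ik}$.

```python
import numpy as np
from scipy.stats import multivariate_normal

def denoise_patch(y, pi, mus, Us, a, sigma):
    """MMSE estimate E[X | Y = y] under the HDMI model, for one noisy patch y.
    a[k] contains (a_k1, ..., a_kd_k) and Us[k] is the p x d_k matrix U_k."""
    K, p = len(pi), y.shape[0]
    # posterior probabilities t_k of each group given y (computed in log scale)
    logw = np.empty(K)
    for k in range(K):
        Sigma_k = Us[k] @ np.diag(a[k] - sigma**2) @ Us[k].T + sigma**2 * np.eye(p)
        logw[k] = np.log(pi[k]) + multivariate_normal.logpdf(y, mus[k], Sigma_k)
    t = np.exp(logw - logw.max())
    t /= t.sum()
    # group-wise estimates psi_k(y): shrinkage along the d_k leading directions U_k
    x_hat = np.zeros(p)
    for k in range(K):
        shrink = (a[k] - sigma**2) / a[k]
        psi_k = mus[k] + Us[k] @ (shrink * (Us[k].T @ (y - mus[k])))
        x_hat += t[k] * psi_k
    return x_hat
```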
Model inference
EM algorithm: maximize w.r.t. $\theta$ the conditional expectation of the complete log-likelihood
$$\Psi(\theta, \theta^*) \stackrel{\text{def}}{=} \sum_{k=1}^{K} \sum_{i=1}^{n} t_{ik} \log\left(\pi_k\, g(y_i; \theta_k)\right),$$
where $t_{ik} = P(Z = k \mid y_i, \theta^*)$ and $\theta^*$ is a given set of parameters.
- E-step: estimation of the $t_{ik}$ knowing the current parameters.
- M-step: compute maximum likelihood estimators (MLE) for the parameters:
$$\hat{\pi}_k = \frac{n_k}{n}, \qquad \hat{\mu}_k = \frac{1}{n_k} \sum_i t_{ik}\, y_i, \qquad \hat{S}_k = \frac{1}{n_k} \sum_i t_{ik}\,(y_i - \hat{\mu}_k)(y_i - \hat{\mu}_k)^T,$$
with $n_k = \sum_i t_{ik}$. Then $\hat{Q}_k$ is formed by the $d_k$ first eigenvectors of $\hat{S}_k$ and $\hat{a}_{kj}$ is the $j$-th eigenvalue of $\hat{S}_k$.
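For concreteness, a sketch of the M-step under these formulas, assuming the responsibilities $t_{ik}$ have already been computed in the E-step and the dimensions $d_k$ are fixed; the data layout and function name are hypothetical.

```python
import numpy as np

def m_step(Y, T, dims):
    """One M-step of the EM algorithm for the HDMI model.
    Y is n x p (noisy patches), T is n x K (responsibilities t_ik from the E-step),
    dims = (d_1, ..., d_K) are the fixed intrinsic dimensions."""
    n, p = Y.shape
    params = []
    for k in range(T.shape[1]):
        n_k = T[:, k].sum()
        pi_k = n_k / n                                 # mixture proportion
        mu_k = (T[:, k, None] * Y).sum(axis=0) / n_k   # weighted mean
        C = Y - mu_k
        S_k = (T[:, k, None] * C).T @ C / n_k          # weighted sample covariance
        eigval, eigvec = np.linalg.eigh(S_k)           # eigenvalues in ascending order
        order = np.argsort(eigval)[::-1]               # sort in decreasing order
        Q_k = eigvec[:, order[:dims[k]]]               # first d_k eigenvectors
        a_k = eigval[order[:dims[k]]]                  # first d_k eigenvalues
        params.append((pi_k, mu_k, Q_k, a_k))
    return params
```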
Model inference: the hyper-parameters
The hyper-parameters $K$ and $d_1, \dots, d_K$ cannot be determined by maximizing the log-likelihood since they control the model complexity.
We propose to set $K$ to a given value (in the experiments we use $K = 40$ and $K = 90$) and to choose the intrinsic dimensions $d_k$:
- using a heuristic that links $d_k$ with the noise variance $\sigma$ when it is known;
- using a model selection tool in order to select the best $\sigma$ when it is unknown.
Estimation of intrinsic dimensions when $\sigma$ is known
With $d_k$ being fixed, the MLE of the noise variance in the $k$-th group is
$$\hat{\sigma}^2_{|k} = \frac{1}{p - d_k} \sum_{j = d_k + 1}^{p} \hat{a}_{kj}.$$
When the noise variance $\sigma$ is known, this gives us the following heuristic.
Heuristic. Given a value of $\sigma^2$ and for $k = 1, \dots, K$, we estimate the dimension $d_k$ by
$$\hat{d}_k = \operatorname*{argmin}_{d} \left| \frac{1}{p - d} \sum_{j = d + 1}^{p} \hat{a}_{kj} - \sigma^2 \right|.$$
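The heuristic translates directly into a few lines of code; a sketch assuming the eigenvalues $\hat{a}_{kj}$ of the group covariance are given in decreasing order (names are illustrative).

```python
import numpy as np

def estimate_dimension(a_k, sigma):
    """Heuristic estimate of the intrinsic dimension d_k, given the eigenvalues
    a_k of the group covariance (sorted in decreasing order) and a known sigma."""
    p = len(a_k)
    # mean of the trailing eigenvalues a_{k,d+1}, ..., a_{k,p} for each candidate d
    errors = [abs(a_k[d:].mean() - sigma**2) for d in range(p)]
    return int(np.argmin(errors))
```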
Estimation of intrinsic dimensions when $\sigma$ is unknown
Each value of $\sigma$ yields a different model; we propose to select the one with the best BIC (Bayesian Information Criterion)
$$\mathrm{BIC}(\mathcal{M}) = \ell(\hat{\theta}) - \frac{\xi(\mathcal{M})}{2} \log(n),$$
where $\xi(\mathcal{M})$ is the complexity of the model.
Why is BIC well adapted for the selection of $\sigma$?
- If $\sigma$ is too small, the likelihood is good but the complexity explodes.
- If $\sigma$ is too large, the complexity is low but the likelihood is bad.
Estimation of intrinsic dimensions when $\sigma$ is unknown
Reminder of the projected covariance structure, which shows how $\sigma^2$ and the dimensions $d_k$ interact:
$$\Delta_k = \mathrm{diag}(\underbrace{a_{k1}, \dots, a_{kd_k}}_{d_k}, \underbrace{\sigma^2, \dots, \sigma^2}_{p - d_k}).$$
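A hedged sketch of the corresponding selection loop: compute the BIC for each candidate $\sigma$ and keep the best one. The helper fit_hdmi is hypothetical; it stands for a full EM fit returning the maximized log-likelihood and the model complexity $\xi(\mathcal{M})$.

```python
import numpy as np

def bic(log_likelihood, complexity, n):
    """BIC(M) = l(theta_hat) - xi(M) / 2 * log(n); the larger, the better."""
    return log_likelihood - 0.5 * complexity * np.log(n)

# Hypothetical selection loop over a grid of candidate noise levels.
# fit_hdmi(Y, K, sigma) is assumed to run the EM algorithm and return the
# maximized log-likelihood and the complexity xi(M) of the fitted model.
# sigmas = np.linspace(5, 50, 10)
# best_sigma = max(sigmas, key=lambda s: bic(*fit_hdmi(Y, K, s), n=len(Y)))
```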
Experiment: selection of $\sigma$ with BIC
Numerical experiments: visualization of the intrinsic dimensions
We display for each pixel the dimension of the most probable group of the patch around it.
[Figure: noisy image, clustering, dimensions map, clean image, for the Simpson and Barbara images.]
Regularizing effect of the dimension reduction