Statistical modeling, estimation and inverse problems
Jalal Fadili, GREYC CNRS, Caen, France
Course "Variational and sparse methods in signal and image processing" (Méthodes variationnelles et parcimonieuses en traitement des signaux et des images), IHP, 21 March 2008
What is this course about?
Denoising: $y = x + \varepsilon$, where the signal $x$ is synthesized in a dictionary of atoms, $x$ ($n \times 1$) $= \Phi$ ($n \times L$) $\alpha$ ($L \times 1$).
Deconvolution: $y = h \circledast x + \varepsilon$, i.e. blurring by a PSF $h$ followed by additive noise $\varepsilon$.
What is this course about?
Statistical modeling. Transforms and sparsity. Estimation by thresholding. Bayesian estimation. Inverse problems. Image restoration.
Outline
Elements of Bayesian estimation theory: the Bayesian paradigm for images; Bayesian estimation.
Statistical modeling of images and sparsity.
Wavelet estimation and beyond: minimax estimation; individual thresholding estimators; joint thresholding estimators.
Linear inverse problems with sparsity: MAP formulation vs variational formulation; optimization framework; applications to image restoration.
Observation model
The image is viewed as realization(s) of a random variable or a random field whose degradation equation is
$$Y_s = M\big[F\big((H X)_s\big) \odot \varepsilon_s\big],$$
where $\odot$ is any composition of its two arguments (e.g. '+', '$\cdot$') and $s \in S$ is the location index.
$\varepsilon_s$ is the (random) noise, generally assumed AWGN but not necessarily so (e.g. speckle, Poisson, $1/f$).
$H$ is a (possibly non-linear) bounded degradation operator (e.g. convolution with a PSF).
$F$ is a transformation, not necessarily linear nor invertible (e.g. sensor-specific, variance-stabilizing transform, sensing operator, etc.).
$M$ is an operator reflecting the missing-data mechanism.
How to recover the unobserved $X$ from the observed $Y$? An ill-posed inverse problem.
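To fix ideas, here is a minimal simulation of this degradation chain (my illustration, not from the slides), assuming a Gaussian PSF for $H$, the identity for $F$, additive Gaussian noise for $\odot = {}$'+', and a random missing-pixel mask for $M$:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

# Ground-truth image x: a piecewise-constant toy image (sparse gradient).
n = 128
x = np.zeros((n, n))
x[32:96, 32:96] = 1.0

# H: convolution with a Gaussian PSF (a linear, bounded operator).
Hx = gaussian_filter(x, sigma=2.0)

# F: identity here; could be a variance-stabilizing transform instead.
Fx = Hx

# epsilon: AWGN, composed with '+' (the simplest choice of the composition).
sigma_noise = 0.05
noisy = Fx + sigma_noise * rng.standard_normal(x.shape)

# M: missing-data mask deleting 20% of the pixels at random.
mask = rng.random(x.shape) > 0.2
y = np.where(mask, noisy, 0.0)
```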
Inverse problems
Well-posedness in the sense of Hadamard: at least one solution exists; the solution is unique; the solution depends continuously on the data.
Ill-posedness arises when one (or more) of these conditions is violated.
Regularization is used to narrow down the set of candidate solutions: functional spaces, stochastic models, sparsity, etc.
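A quick numerical illustration of ill-posedness (my addition, not from the slides): for 1-D deconvolution with a smooth kernel, naive Fourier-domain inversion amplifies noise at the frequencies where the kernel's spectrum is tiny, while a Tikhonov-regularized inverse stays stable:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256

# A smooth (low-pass) convolution kernel: its spectrum has tiny values
# at high frequencies, which is what makes deconvolution ill-posed.
t = np.arange(n)
h = np.exp(-0.5 * ((t - n // 2) / 3.0) ** 2)
h /= h.sum()

x = np.zeros(n); x[n // 3: 2 * n // 3] = 1.0   # ground truth
H = np.fft.fft(np.fft.ifftshift(h))
y = np.real(np.fft.ifft(H * np.fft.fft(x))) + 1e-3 * rng.standard_normal(n)

# Naive inversion: divide by H in Fourier. Tiny |H| entries blow up the noise.
x_naive = np.real(np.fft.ifft(np.fft.fft(y) / H))
print("naive inversion error:", np.linalg.norm(x_naive - x))  # astronomically large

# Tikhonov-regularized inversion: stable, at the price of some bias.
lam = 1e-3
x_reg = np.real(np.fft.ifft(np.conj(H) * np.fft.fft(y) / (np.abs(H) ** 2 + lam)))
print("regularized error:", np.linalg.norm(x_reg - x))        # small
```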
Bayesian paradigm
Observation model: $Y = F(HX) \odot \varepsilon$.
$p(x, z)$: prior distribution, where $z$ denotes some other image features.
$p(y \mid x, z)$: likelihood (given $x$ and $z$), determined by the noise distribution $p(\varepsilon)$.
$p(y)$: marginal distribution, $p(y) = \int_{\mathcal{X} \times \mathcal{Z}} p(y \mid x, z)\, p(x, z)\, dx\, dz$.
$p(x, z \mid y)$: posterior distribution, $p(x, z \mid y) = \dfrac{p(y \mid x, z)\, p(x, z)}{p(y)}$.
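As a worked special case (my addition; scalar signal, no auxiliary features $z$), a Gaussian prior combined with a Gaussian likelihood gives a Gaussian posterior in closed form. Assume $X \sim \mathcal{N}(0, \tau^2)$ (prior) and $Y \mid X = x \sim \mathcal{N}(x, \sigma^2)$ (likelihood):
\begin{align*}
p(x \mid y) &\propto p(y \mid x)\, p(x)
 \propto \exp\!\left(-\frac{(y - x)^2}{2\sigma^2} - \frac{x^2}{2\tau^2}\right)
 \propto \exp\!\left(-\frac{\sigma^2 + \tau^2}{2\sigma^2\tau^2}
   \left(x - \frac{\tau^2}{\sigma^2 + \tau^2}\, y\right)^{\!2}\right),
\end{align*}
so $X \mid Y = y \sim \mathcal{N}\!\left(\frac{\tau^2}{\sigma^2 + \tau^2}\, y,\ \frac{\sigma^2 \tau^2}{\sigma^2 + \tau^2}\right)$: a shrinkage of the observation $y$ toward the prior mean $0$.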
Bayesian estimation
Bayesian estimation amounts to finding the operator $\mathcal{D}: \mathcal{Y} \to \mathcal{X},\ y \mapsto \hat{x} = \mathcal{D}(y)$, minimizing the Bayes risk:
$$\hat{x} = \mathcal{D}^\star(y), \qquad \mathcal{D}^\star = \underset{\mathcal{D} \in \mathcal{O}_n}{\arg\inf}\; R(X, \mathcal{D}(Y)), \quad R(X, \mathcal{D}(Y)) = \mathbb{E}_{Y,X}\big[L(X, \mathcal{D}(Y))\big].$$
Proposition 1. Let $X$ and $Y$ be two RVs taking values in $\mathcal{X}$ and $\mathcal{Y}$ according to the above observation model.
(i) The MAP estimator corresponds to $L(x, \hat{x}) = 1 - \delta(x - \hat{x})$: $\hat{x}_{\mathrm{MAP}} = \arg\max_{x \in \mathcal{X}} p_{X \mid Y}(x \mid y)$.
(ii) The MMSE estimator corresponds to $L(x, \hat{x}) = (x - \hat{x})^2$: $\hat{x}_{\mathrm{MMSE}} = \mathbb{E}[X \mid Y]$.
(iii) The MMAE estimator corresponds to $L(x, \hat{x}) = |x - \hat{x}|$: $\hat{x}_{\mathrm{MMAE}}$ is such that $\Pr(X > \hat{x}_{\mathrm{MMAE}} \mid Y = y) = \frac{1}{2}$, i.e. the posterior median.
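For (ii), the reasoning step worth making explicit is the classical decomposition of the posterior risk:
\begin{align*}
\mathbb{E}\big[(X - \hat{x})^2 \mid Y = y\big]
 = \mathbb{E}\big[(X - \mathbb{E}[X \mid y])^2 \mid y\big]
 + \big(\mathbb{E}[X \mid y] - \hat{x}\big)^2,
\end{align*}
where the cross term vanishes because $\mathbb{E}\big[X - \mathbb{E}[X \mid y] \mid y\big] = 0$. The first term does not depend on $\hat{x}$, so the minimizer is the posterior mean $\hat{x} = \mathbb{E}[X \mid y]$.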
Bayesian estimation
MAP involves solving an optimization problem.
MMSE involves solving an integration problem:
Closed form: the exception rather than the rule.
Analytical approximation (Laplace, saddlepoint, etc.): requires smoothness.
Numerical quadrature: unrealistic in high dimensions.
Monte Carlo (see the sketch below).
For mutually independent i.i.d. Gaussian signal and noise, the MAP, MMSE and Wiener estimators coincide.
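A minimal Monte Carlo sketch (my illustration, not from the slides): with a Laplace prior and a Gaussian likelihood, the posterior mean (MMSE) and median (MMAE) can be approximated by self-normalized importance sampling, using the prior as the proposal distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: X ~ Laplace(0, b), Y | X = x ~ N(x, sigma^2).
b, sigma = 1.0, 0.5
y = 1.3  # one observation

# Self-normalized importance sampling with the prior as proposal:
# E[X | y] = E_prior[X * p(y|X)] / E_prior[p(y|X)].
xs = rng.laplace(0.0, b, size=200_000)            # samples from the prior
w = np.exp(-0.5 * ((y - xs) / sigma) ** 2)        # unnormalized likelihood weights
x_mmse = np.sum(w * xs) / np.sum(w)               # posterior mean (MMSE)

# Posterior median (MMAE) from the same weighted sample.
order = np.argsort(xs)
cdf = np.cumsum(w[order]) / np.sum(w)
x_mmae = xs[order][np.searchsorted(cdf, 0.5)]

print(f"MMSE ~ {x_mmse:.3f}, MMAE ~ {x_mmae:.3f}")
```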
Invariance properties
Ax. 1 Scale invariance:
$$p(T_\phi(X)) = p(X), \qquad (T_\phi(X))(s) = X\big(\phi^{-1}(s)\big), \quad \phi(s) = \lambda s + t, \quad X \subset D.$$
There is no scale-invariant probability measure on image functions, but there is one on Schwartz distributions [MumfordGidas01].
Self-similarity of the power spectrum: $S_X(\omega) = h(\theta)\, \omega^{-r}$, with $\omega = \sqrt{\omega_1^2 + \omega_2^2}$.
Ax. 2 Clutter and infinite divisibility:
Images with clutter $c$ are random samples from an infinitely divisible probability measure $p_c \in \mathcal{D}'$, such that $p_{d+c} = p_c * p_d$. An image $X$ with clutter $c$ can thus be formed as a sum of $m$ independent images $(X_i)_{1 \le i \le m}$, each with clutter $c/m$: form an image with a certain clutter as the superposition of independent images with less clutter. Be careful with partial occlusions (they violate independence).
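The self-similarity axiom can be checked numerically: radially average the power spectrum and regress $\log S_X(\omega)$ on $\log \omega$. A sketch (my addition; the synthetic $1/f$-type field below stands in for a real grayscale image, for which the exponent $r$ typically comes out near 2):

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical stand-in for a natural image: a synthetic field whose
# power spectrum decays as omega^{-2}. Replace `img` with any real image.
n = 256
f1 = np.fft.fftfreq(n)[:, None]
f2 = np.fft.fftfreq(n)[None, :]
radial = np.sqrt(f1 ** 2 + f2 ** 2); radial[0, 0] = 1.0
spectrum = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
img = np.real(np.fft.ifft2(spectrum / radial))

# Radially averaged power spectrum S_X(omega).
P = np.abs(np.fft.fft2(img)) ** 2
bins = np.linspace(0.01, 0.5, 30)
idx = np.digitize(radial.ravel(), bins)
S = np.array([P.ravel()[idx == k].mean() for k in range(1, len(bins))])
omega = 0.5 * (bins[:-1] + bins[1:])

# Fit log S = log h - r log omega; r should come out near 2 here.
r = -np.polyfit(np.log(omega), np.log(S), 1)[0]
print(f"estimated decay exponent r ~ {r:.2f}")
```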
Invariance properties
The Lévy-Khintchine theorem makes explicit the nature of infinite divisibility:
$$X(s_1, s_2) = \sum_i G_i(\lambda_i s_1 + u_i,\ \lambda_i s_2 + v_i),$$
with $(\lambda_i, u_i, v_i)$ a Poisson process in $\mathbb{R}^+ \times \mathbb{R}^2$ with density $du\, dv\, d\lambda / \lambda$, and the $G_i$ independent random objects sampled from a reduced Lévy measure.
[MumfordGidas01] called this a random wavelet expansion and proved its convergence in the sense of distributions, under mild conditions on the reduced Lévy measure.
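A toy sampler of such an expansion (my illustration, not the exact Mumford-Gidas construction): truncating the scale density $d\lambda/\lambda$ to $[\lambda_{\min}, \lambda_{\max}]$ makes $\lambda$ log-uniform, and the generators are fixed here to a single odd Gaussian bump with random amplitude:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy random wavelet expansion on [0, 1]^2. Assumptions: scales truncated to
# [lam_min, lam_max] (density d(lam)/lam => log-uniform lam), shifts chosen so
# the bump centers -u/lam fall inside the image, generators = odd Gaussian bumps.
n, lam_min, lam_max = 128, 2.0, 64.0
s1, s2 = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n), indexing="ij")

X = np.zeros((n, n))
for _ in range(rng.poisson(300)):                        # Poisson number of objects
    lam = lam_min * (lam_max / lam_min) ** rng.random()  # log-uniform scale
    u, v = -lam * rng.random(2)                          # center -u/lam in [0, 1]
    a = rng.standard_normal()                            # random amplitude
    z1, z2 = lam * s1 + u, lam * s2 + v
    X += a * z1 * np.exp(-0.5 * (z1 ** 2 + z2 ** 2))     # one transported generator

# X now superposes independent "objects" across all scales in [lam_min, lam_max].
```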
Sparsity prior
Choose a Hilbert space equipped with a basis or a frame. Project the original data onto it, so that all components (coefficients) but a few are zero (the concept of sparsity):
$$x = \sum_{\gamma \in \Gamma} \alpha_\gamma \varphi_\gamma, \qquad \varphi_\gamma \in \mathcal{D}_\Phi.$$
Non-linear approximation theory: in a dictionary of atoms, $x$ ($n \times 1$) $= \Phi$ ($n \times L$) $\alpha$ ($L \times 1$), and the sorted coefficients show a few big entries and many small ones.
This is a matter of dimensionality reduction. Many images have a sparse gradient and fall within this intuitive model (e.g. TV norm); see the sketch below.
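A minimal non-linear approximation experiment (my addition), assuming PyWavelets is available: keep only the largest-magnitude wavelet coefficients of a piecewise-smooth image and measure the reconstruction error:

```python
import numpy as np
import pywt  # PyWavelets

# Best M-term (non-linear) approximation in an orthogonal wavelet basis:
# keep the `keep` fraction of largest-magnitude coefficients, zero the rest.
def nla(img, wavelet="db4", keep=0.05):
    coeffs = pywt.wavedec2(img, wavelet)
    arr, slices = pywt.coeffs_to_array(coeffs)
    thresh = np.quantile(np.abs(arr), 1.0 - keep)
    arr = arr * (np.abs(arr) >= thresh)
    return pywt.waverec2(pywt.array_to_coeffs(arr, slices, output_format="wavedec2"),
                         wavelet)

# Toy piecewise-smooth image; 5% of the coefficients already give a close fit.
n = 256
img = np.add.outer(np.linspace(0, 1, n), np.zeros(n))
img[64:192, 64:192] += 1.0
approx = nla(img)
print("relative NLA error:", np.linalg.norm(approx - img) / np.linalg.norm(img))
```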
Statistical behavior
Expected marginal statistical behavior: unimodal, centered at zero; non-Gaussian, sharply peaked distribution; heavy tails (weak sparsity and NLA error decay).
Expected joint statistical behavior: persistence across bands/scales; intra-band/scale dependence.
[Figure: log-scale histogram of wavelet coefficients with fitted marginal models. Dashed: GGD [Mallat 89]; dotted: α-stable [Achim et al. 01, Boubchir and Fadili 04]; solid: Bessel K form [Grenander et al. 01, Fadili et al. 03].]
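This leptokurtic marginal behavior is easy to verify empirically (my addition; the synthetic image is a stand-in for any natural grayscale image):

```python
import numpy as np
import pywt
from scipy.stats import kurtosis

# Detail wavelet coefficients of piecewise-smooth / natural images typically
# show large positive excess kurtosis, the signature of heavy-tailed marginals.
n = 256
img = np.add.outer(np.linspace(0, 1, n), np.linspace(0, 1, n))
img[64:192, 64:192] += 1.0

# wavedec2 returns [approx, details_coarsest, ..., details_finest].
for j, (cH, cV, cD) in enumerate(pywt.wavedec2(img, "db4", level=3)[1:], 1):
    d = np.concatenate([cH.ravel(), cV.ravel(), cD.ravel()])
    print(f"detail band {j} (coarse to fine): excess kurtosis = {kurtosis(d):.1f}")
```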
Superposition model revisited
Transported Generator Model [Grenander et al. 02]:
$$X(s_1, s_2) = \sum_i A_i\, G_i(\lambda_i s_1 + u_i,\ \lambda_i s_2 + v_i), \qquad (s_1, s_2) \in [0, 1]^2,$$
where the $A_i$ are iid $\mathcal{N}(0, 1)$, the $\lambda_i$ are iid uniform in $[0, 1]$, and the $(u_i, v_i)$ are an independent 2D homogeneous Poisson process $\mathcal{P}(\mu)$.
Lemma 1. Let $X_h = X * h$. Then $X_h \overset{d}{=} Z\sqrt{U}$, where $U = \sum_i (G_i * h)^2 > 0$. Moreover, the RV $X_h$ is leptokurtic and heavy-tailed.
Typically, $h$ stems from a band-pass filter bank, e.g. wavelets.
Gaussian scale mixture
Definition 1 (Andrews and Mallows 74). Let $X$ be a RV with real-valued realizations. $X$ is a scale mixture of Gaussians (SMG) if there exist two independent RVs $U \ge 0$ and $Z \sim \mathcal{N}(0, 1)$ such that $X \overset{d}{=} Z\sqrt{U}$.
Property 1.
SMG is a subset of the elliptically symmetric distributions [Kotz et al. 89].
$p_X(0)$ exists iff $\mathbb{E}[U^{-1/2}] < +\infty$. The pdf of $X$ is
$$p_X(x) = \frac{1}{\sqrt{2\pi}} \int_0^{+\infty} u^{-1/2}\, e^{-\frac{x^2}{2u}}\, f_U(u)\, du.$$
It is unimodal, symmetric around the mode, and differentiable almost everywhere.
The characteristic function of $X$ is $\Phi_X(\omega) = \mathcal{L}[p_U]\!\left(\frac{\omega^2}{2}\right)$, where $\mathcal{L}$ is the Laplace transform.
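A quick numerical check (my addition): taking an exponential mixing density with mean $2b^2$ gives $\mathcal{L}[p_U](\omega^2/2) = (1 + b^2\omega^2)^{-1}$, which is the characteristic function of a Laplace$(0, b)$ variable, so the simulated SMG should match a Laplacian:

```python
import numpy as np
from scipy.stats import kurtosis, laplace

rng = np.random.default_rng(0)

# SMG with exponential mixing: if U ~ Exp(mean = 2 b^2), then
# Phi_X(w) = L[p_U](w^2 / 2) = 1 / (1 + b^2 w^2), i.e. X ~ Laplace(0, b).
b = 1.0
U = rng.exponential(2 * b ** 2, size=1_000_000)
Z = rng.standard_normal(1_000_000)
X = Z * np.sqrt(U)

print("excess kurtosis:", kurtosis(X))   # ~ 3 for a Laplacian (leptokurtic)

# Compare empirical quantiles with Laplace(0, b) quantiles: they should agree.
q = np.linspace(0.05, 0.95, 7)
print(np.quantile(X, q).round(2))
print(laplace.ppf(q, scale=b).round(2))
```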
Properties
Necessary and sufficient conditions for such a representation to exist:
Proposition 2 (Andrews and Mallows 74). The RV $X$ has an SMG representation iff the $k$-th derivatives of $p_X(\sqrt{y})$ have alternating signs, i.e.
$$\left(-\frac{d}{dy}\right)^{\!k} p_X(\sqrt{y}) \ge 0 \qquad \forall y > 0,\ \forall k \ge 0$$
(complete monotonicity of $y \mapsto p_X(\sqrt{y})$).
Lemma 2 (Fadili et al. 05). If $X \overset{d}{=} Z\sqrt{U}$ with random $U \ge 0$ and $Z \sim \mathcal{N}(0, \sigma^2)$, then $\mathrm{kurtosis}(X) > 0$: the symmetric distribution of $X$ is necessarily sharply peaked (leptokurtic), with heavy tails.
A good candidate as a sparsity prior: it generalizes the Laplacian ($\ell_1$ penalty), GGD, $\alpha$-stable and BKF priors.
There is an explicit relationship between the parameters of the SMG prior model and the Besov space to which the wavelet coefficients of the image a.s. belong [Fadili et al. 05].
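As a worked instance of Proposition 2 (my addition), the Laplacian prior passes the complete-monotonicity test, consistently with the exponential-mixing construction in the previous sketch. For $p_X(x) = \frac{1}{2b} e^{-|x|/b}$, set $f(y) = p_X(\sqrt{y}) = \frac{1}{2b} e^{-\sqrt{y}/b}$; then
\begin{align*}
-\frac{d}{dy} f(y) = \frac{1}{4 b^2 \sqrt{y}}\, e^{-\sqrt{y}/b} \;\ge\; 0 \qquad \forall y > 0,
\end{align*}
and more generally $\sqrt{y}$ is a Bernstein function, hence $e^{-\sqrt{y}/b}$ is completely monotone: every higher-order alternating-sign condition holds, so the Laplacian indeed admits an SMG representation.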