24/10/2014

Sistemi Intelligenti
MAP Estimation (Stima MAP)
Alberto Borghese
Università degli Studi di Milano
Laboratory of Applied Intelligent Systems (AIS-Lab)
Dipartimento di Scienze dell'Informazione
borghese@di.unimi.it

Overview
• Filtering images
• MAP, Tikhonov and Poisson model of the noise
• A-priori and Markov Random Fields
• Cost function minimization
Images are corrupted by noise…
i) When the measurement of some physical parameter is performed, noise corruption cannot be avoided.
ii) Each pixel of a digital image measures a number of photons.
Therefore, from i) and ii)… images are corrupted by noise!

A general framework
• f = \{f_1, f_2, \dots, f_M\} \in R^M, e.g. the true luminance of each pixel
• g = \{g_1, g_2, \dots, g_N\} \in R^N, e.g. the measured luminance of each pixel
• g = A f + h + v  ->  determining f is a deblurring problem (the measuring instrument transforms the image: scale + offset)
• g = I f + v  ->  determining f is a denoising problem (the image is a copy of the real one with the addition of noise)
It is a general framework. It is a linear framework. h is the background radiation; v is the noise.
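To make the linear framework concrete, here is a minimal NumPy sketch of the forward model g = A f + h + v on a 1-D signal. The blur kernel, background level h, and noise standard deviation are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 64                                # number of pixels of the true signal f
f = np.zeros(M)
f[20:40] = 1.0                        # a simple piecewise-constant object

# A: a linear operator, here a small Gaussian blur built as an M x M matrix
x = np.arange(M)
sigma_blur = 1.5                      # illustrative blur width
A = np.exp(-0.5 * ((x[:, None] - x[None, :]) / sigma_blur) ** 2)
A /= A.sum(axis=1, keepdims=True)     # normalize rows so total luminance is preserved

h = 0.1                               # background radiation (offset)
v = rng.normal(0.0, 0.05, size=M)     # additive noise

g_deblur = A @ f + h + v              # measurement for the deblurring problem
g_denoise = f + v                     # A = I: measurement for the denoising problem
```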
Gaussian noise and likelihood
• Images are composed of a set of pixels, f (f is a vector!)
• Let us assume that the noise is Gaussian and that its mean and variance are equal for all pixels;
• Let g_{n,i} be the measured value for the i-th pixel (n = noise);
• Let f_i be the true (noiseless) value for the i-th pixel;
• How can we quantify the probability of measuring the image g_n, given the probability density function of each pixel?
• Likelihood function, L(g_n | f):
L(g_n \mid f) = \prod_{i=1}^{N} p(g_{n,i} \mid f_i) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(g_{n,i} - f_i)^2}{2\sigma^2}\right)
• L(g_n | f) describes the probability of measuring the image g_n, given the noise-free value of each pixel, f. But we do not know these values….

Statistical formulation of image restoration
Measuring an image g taken from an object f, we want to determine f when g is corrupted by noise:
g_n = A f + b + \text{noise} \;\rightarrow\; f\,?
It is a typical inverse problem. A is a linear operator that describes the transformation (mapping) from f to g (e.g. perspective projection, sensor transfer function, A = I for denoising…). b is the background radiation: it is the measure g when no signal arrives at the sensor.
Each pixel is considered an independent process (white noise). For each pixel, therefore, we want to find f that maximizes p(g_n; f).
Since the pixels are independent, the total probability can be written as a product of independent probabilities (likelihood function):
L(g_n; f) = \prod_{i=1}^{N} p(g_{n,i}; f_i)
L is the likelihood function of g_n, given the object f.
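A minimal sketch of the pixel-wise Gaussian likelihood defined above, computed as a log-likelihood so that the product over pixels does not underflow numerically; the noise level sigma is an assumed, known parameter.

```python
import numpy as np

def gaussian_log_likelihood(g_n, f, sigma):
    """ln L(g_n | f) = sum_i ln p(g_n_i | f_i) for i.i.d. Gaussian pixel noise."""
    g_n = np.asarray(g_n, dtype=float)
    f = np.asarray(f, dtype=float)
    return np.sum(-0.5 * np.log(2.0 * np.pi * sigma ** 2)
                  - (g_n - f) ** 2 / (2.0 * sigma ** 2))
```

Maximizing this quantity over f is the maximum-likelihood problem addressed in the next slide.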
Do we get anywhere?
L is the likelihood function of g_n, given the object f:
L(g_n; f) = \prod_{i=1}^{N} p(g_{n,i}; f_i)
Determine \{f_i\} such that L is maximized. The negative log-likelihood is usually considered, so as to deal with sums:
-\ln L = -\sum_{i=1}^{N} \ln p(g_{n,i}; f_i)
For Gaussian noise:
-\ln L(g_{n,1}, \dots, g_{n,N}; f_1, \dots, f_N) = -\sum_{i=1}^{N} \ln\!\left[\frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(g_{n,i} - f_i)^2}{2\sigma^2}\right)\right]
\min_{\{f_i\}}(-\ln L) = \min_{\{f_i\}} \sum_{i=1}^{N}\left[\ln(\sqrt{2\pi}\,\sigma) + \frac{(g_{n,i} - f_i)^2}{2\sigma^2}\right] \;\Rightarrow\; f = (A^T A)^{-1} A^T g_n
If A = I, the solution is f_i = g_{n,i}.
The system has a single solution, which is good. The solution, however, is f_i = g_{n,i}: not a great result…. Can we do any better?

Overview
• Filtering images
• MAP, Tikhonov and Poisson model of the noise
• A-priori and Markov Random Fields
• Cost function minimization
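Before moving to the Bayesian framework, a short sketch of the maximum-likelihood result just derived: for Gaussian noise the estimate is the least-squares solution f = (A^T A)^{-1} A^T g_n, which reduces to f = g_n when A = I. The data below are made up for illustration.

```python
import numpy as np

def ml_estimate(A, g_n):
    """Maximum-likelihood estimate under Gaussian noise: least squares on g_n = A f.

    Equivalent to (A^T A)^{-1} A^T g_n when A has full column rank, but solved
    with lstsq, which is numerically safer than forming the explicit inverse.
    """
    f_hat, *_ = np.linalg.lstsq(np.asarray(A, dtype=float),
                                np.asarray(g_n, dtype=float), rcond=None)
    return f_hat

# With A = I the ML estimate simply returns the noisy data: no denoising at all.
g_n = np.array([0.9, 1.1, 1.05, 0.2])
print(ml_estimate(np.eye(4), g_n))    # -> [0.9  1.1  1.05 0.2]
```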
The Bayesian framework
We assume that the object f is a realization of an "abstract" object F that can be characterized statistically by a probability density on F. f is extracted randomly from F.
The probability p(g_n | f) becomes a conditional probability: J_0 = p(g_n | f = f^*).
Under this condition, the probability of observing f and g_n (joint probability) can be written as the product of the conditional probability p(g_n | f) and the a-priori probability on f, p_f:
p(g_n, f) = p(g_n \mid f)\, p_f
As we are interested in determining f, we have to write the conditional probability of f given g_n, p(f | g_n). We apply Bayes' theorem:
p(f \mid g_n) = \frac{p(g_n \mid f)\, p_f}{p_{g_n}} = \frac{L(g_n; f)\, p_f}{p_{g_n}}

MAP estimate with logarithms
p(f \mid g_n) = \frac{p(g_n \mid f)\, p_f}{p_{g_n}} = \frac{L(g_n; f)\, p_f}{p_{g_n}}
Logarithms help:
-\ln p(f \mid g_n) = -\ln\frac{p(g_n \mid f)\, p_f}{p_{g_n}} = -\ln p(g_n \mid f) - \ln p_f + \ln p_{g_n}
We maximize the MAP of f | g_n by minimizing:
\arg\min_f \left\{-\ln\frac{p(g_n \mid f)\, p_f}{p_{g_n}}\right\} = \arg\min_f \left\{-\ln p(g_n \mid f) - \ln p_f + \ln p_{g_n}\right\}
We explicitly observe that the marginal distribution p_{g_n} does not depend on f. It does not affect the minimization and can be neglected. It represents the statistical distribution of the measurements alone.
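As a sketch of how the MAP criterion above is used in practice, the function below assembles the negative log-posterior (up to the constant ln p_{g_n}) from a negative log-likelihood and a negative log-prior; both are passed in as functions, and the Gaussian choices in the example are hypothetical.

```python
import numpy as np

def neg_log_posterior(f, g_n, neg_log_likelihood, neg_log_prior):
    """-ln p(f | g_n) up to the additive constant ln p_{g_n}, which does not depend on f."""
    return neg_log_likelihood(g_n, f) + neg_log_prior(f)

# Hypothetical choices: Gaussian noise and a Gaussian prior on the norm of f.
sigma, beta = 0.05, 10.0
nll = lambda g_n, f: np.sum((np.asarray(g_n) - np.asarray(f)) ** 2) / (2.0 * sigma ** 2)
nlp = lambda f: np.sum(np.asarray(f) ** 2) / beta

print(neg_log_posterior([0.8, 1.0], [0.9, 1.1], nll, nlp))
```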
MAP estimate with logarithms
We maximize the MAP of f | g_n by minimizing:
\arg\min_f \left\{-\ln\big(p(g_n \mid f)\, p_f\big)\right\} = \arg\min_f \left\{-\ln p(g_n \mid f) - \ln p_f\right\} = \arg\min_f \left\{J_0(g_n; f) + J_R(f)\right\}
where J_0(g_n; f) is the likelihood term (adherence to the data) and J_R(f) is the a-priori term.
Depending on the shape of the noise (inside the likelihood) and on the a-priori distribution of f(.), J_R(f), we get different solutions.

Gibbs priors
We often define the a-priori term, J_R(f), as a Gibbs prior:
p_f = \frac{1}{Z} e^{-\frac{1}{\beta} U(f)}, \qquad Z = \int_{-\infty}^{+\infty} e^{-\frac{1}{\beta} U(f)}\, df \quad (\text{normalization constant})
J_R(f) = -\ln p_f = \ln Z + \frac{1}{\beta} U(f)
U(f) is also termed the potential => J_R(f) is a linear function of the potential U(f).
β describes how fast the prior probability decreases (i.e. how fast the cost J_R(f) increases) as U(f) increases.
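A small sketch of the Gibbs prior as an a-priori cost: up to the constant ln Z, J_R(f) is just U(f)/β. The quadratic potential in the example is only one possible choice (it is the one adopted on the next slide).

```python
import numpy as np

def gibbs_neg_log_prior(f, U, beta):
    """J_R(f) = -ln p_f = ln Z + U(f)/beta; ln Z is constant in f and is dropped here."""
    return U(f) / beta

# One possible potential: the squared norm of the solution.
U_norm = lambda f: np.sum(np.asarray(f, dtype=float) ** 2)

print(gibbs_neg_log_prior(np.array([0.2, 0.5, 0.1]), U_norm, beta=10.0))
```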
Gaussian noise and a-priori term on the norm of the solution
f = \arg\min_f \left\{-\ln\big(p(g_n \mid f)\, p_f\big)\right\} = \arg\min_f \left\{-\ln p(g_n \mid f) - \ln p_f\right\} = \arg\min_f \left\{J_0(g_n; f) + J_R(f)\right\}
Gaussian noise on the data:
J_0(g_n; f) = \text{const.} + \sum_i \big(g_{n,i} - (Af)_i\big)^2
We choose as a-priori term the squared norm of the function f, weighted by P:
p_f = \frac{1}{Z} e^{-\frac{1}{\beta}\|Pf\|^2} \quad\Rightarrow\quad J_R(f) = \text{const.} + \frac{1}{\beta}\sum_i f_i^2 \qquad (P = I)
f = \arg\min_f \left\{\sum_i \big(g_{n,i} - (Af)_i\big)^2 + \frac{1}{\beta}\sum_i f_i^2\right\}

Tikhonov regularization
f = \arg\min_f \left\{\sum_i \big(g_{n,i} - (Af)_i\big)^2 + \frac{1}{\beta}\sum_i f_i^2\right\}
f = \arg\min_f \left\{\sum_i \big(g_{n,i} - (Af)_i\big)^2 + \lambda \sum_i (Pf)_i^2\right\}
(cf. Ridge regression and Levenberg-Marquardt optimization)
It is a quadratic cost function. We find f by minimizing the cost function with respect to f:
\frac{\partial}{\partial f}: \; -A^T g_n + A^T A f + \lambda P^T P f = 0 \;\Rightarrow\; (A^T A + \lambda P^T P)\, f = A^T g_n
\Rightarrow\; f = (A^T A + \lambda P^T P)^{-1} A^T g_n; \quad \text{with } P = I: \; f = (A^T A + \lambda I)^{-1} A^T g_n
Poggio and Girosi, 1990
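A sketch of the closed-form Tikhonov solution derived above, f = (A^T A + λ P^T P)^{-1} A^T g_n, using a linear solve instead of an explicit inverse; A, P and λ are whatever the application dictates, and the denoising example at the end uses made-up numbers.

```python
import numpy as np

def tikhonov_solve(A, g_n, lam, P=None):
    """Solve (A^T A + lam * P^T P) f = A^T g_n; with P = I this is ridge regression."""
    A = np.asarray(A, dtype=float)
    g_n = np.asarray(g_n, dtype=float)
    if P is None:
        P = np.eye(A.shape[1])
    lhs = A.T @ A + lam * (P.T @ P)
    rhs = A.T @ g_n
    return np.linalg.solve(lhs, rhs)

# Denoising example (A = P = I): the estimate shrinks the noisy data toward zero,
# trading adherence to the data against the norm of the solution.
g_n = np.array([0.9, 1.1, 1.05, 0.2])
print(tikhonov_solve(np.eye(4), g_n, lam=0.5))   # equals g_n / (1 + lam)
```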