Image Registration Using Mutual Information
CMPUT 615, Nilanjan Ray
Slides prepared from: http://www.cse.msu.edu/~cse902/
Shannon’s Entropy
• Let p_i be the probability of occurrence of the i-th event.
• Information content of event i: I(p_i) = log(1/p_i)
• Shannon’s Entropy: H = Σ_i p_i log(1/p_i)
How do you interpret these formulas?
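As a concrete illustration (not part of the original slides), here is a minimal Python sketch of these two formulas; the function name shannon_entropy and the use of base-2 logarithms are assumptions made for the example.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits of a probability vector p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                        # terms with p_i = 0 contribute 0
    return float(np.sum(p * np.log2(1.0 / p)))

print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0 bits: dispersed, high entropy
print(shannon_entropy([0.97, 0.01, 0.01, 0.01]))   # ~0.24 bits: peaked, low entropy
```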
Interpretations
• An infrequently occurring event provides more information than a frequently occurring event.
• Entropy measures the overall uncertainty of occurrence of events in a system.
• If a histogram (probability density function) is highly peaked, entropy is low; the more dispersed the histogram, the larger the entropy.
Image Registration with Shannon’s Entropy
• Generate a 2-D joint histogram p(i, j) for the two images.
• If the two images are well registered, p(i, j) will be less dispersed, and the joint entropy will be low.
Entropy for Image Registration
• Using joint entropy for registration:
– Define the joint entropy to be H(A,B) = -Σ_{i,j} p(i,j) log[p(i,j)]
– Images are registered when one is transformed relative to the other so as to minimize the joint entropy.
– The dispersion in the joint histogram is thus minimized.
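A minimal sketch of the joint histogram and joint entropy, assuming NumPy and base-2 logarithms (the function names, bin count, and test arrays are illustrative, not from the slides):

```python
import numpy as np

def joint_histogram(img_a, img_b, bins=32):
    """Normalized 2-D histogram p(i, j) of corresponding pixel intensities."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    return hist / hist.sum()

def joint_entropy(img_a, img_b, bins=32):
    """H(A,B) = -sum_{i,j} p(i,j) log p(i,j), in bits."""
    p = joint_histogram(img_a, img_b, bins)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

a = np.random.rand(64, 64)
print(joint_entropy(a, a))                       # low: histogram mass on the diagonal
print(joint_entropy(a, np.roll(a, 10, axis=1)))  # higher: mass is dispersed
```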
Definitions of Mutual Information
• Three commonly used definitions:
– 1) I(A,B) = H(B) - H(B|A) = H(A) - H(A|B)
• Mutual information is the amount by which the uncertainty in B (or A) is reduced when A (or B) is known.
– 2) I(A,B) = H(A) + H(B) - H(A,B)
• Maximizing the mutual information is equivalent to minimizing the joint entropy (the last term).
• The advantage of mutual information over joint entropy alone is that it includes the individual inputs’ entropies.
• It works better than joint entropy alone in image background regions (low contrast): the joint entropy is low there, but this is offset by low individual entropies as well, so the overall mutual information stays low.
Definitions of Mutual Information II
– 3) I(A,B) = Σ_{a,b} p(a,b) log[ p(a,b) / (p(a) p(b)) ]
• This definition is related to the Kullback-Leibler distance between two distributions.
• It measures the dependence of the two distributions.
• In image registration, I(A,B) is maximized when the images are aligned.
• In feature selection, choose the features that minimize I(A,B) to ensure they are not related.
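A sketch of definition 3 computed from a binned joint intensity histogram (the function name and bin count are assumptions made for illustration):

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """I(A,B) = sum_{a,b} p(a,b) log[ p(a,b) / (p(a) p(b)) ], in bits."""
    p_ab, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab /= p_ab.sum()
    p_a = p_ab.sum(axis=1)              # marginal distribution of A
    p_b = p_ab.sum(axis=0)              # marginal distribution of B
    p_a_p_b = np.outer(p_a, p_b)        # p(a) p(b) for every bin pair
    nz = p_ab > 0                       # only nonzero joint entries contribute
    return float(np.sum(p_ab[nz] * np.log2(p_ab[nz] / p_a_p_b[nz])))
```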
Additional Definitions of Mutual Information
• Two definitions exist for normalizing mutual information:
– Normalized Mutual Information: NMI(A,B) = [H(A) + H(B)] / H(A,B)
– Entropy Correlation Coefficient: ECC(A,B) = 2 - 2 / NMI(A,B)
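An illustrative sketch of both normalized measures, again estimated from a binned joint histogram; the function name and binning are assumptions for the example:

```python
import numpy as np

def nmi_and_ecc(img_a, img_b, bins=32):
    """Return NMI(A,B) = [H(A)+H(B)] / H(A,B) and ECC(A,B) = 2 - 2/NMI(A,B)."""
    p_ab, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab /= p_ab.sum()

    def H(p):                            # Shannon entropy of any distribution
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    h_a, h_b, h_ab = H(p_ab.sum(axis=1)), H(p_ab.sum(axis=0)), H(p_ab)
    nmi = (h_a + h_b) / h_ab
    ecc = 2.0 - 2.0 / nmi
    return nmi, ecc
```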
Derivation of M.I. Definitions
H(A,B) = -Σ_{a,b} p(a,b) log(p(a,b)),  where p(a,b) = p(a|b) p(b)
H(A,B) = -Σ_{a,b} [p(a|b) p(b)] log[p(a|b) p(b)]
H(A,B) = -Σ_{a,b} [p(a|b) p(b)] {log[p(a|b)] + log[p(b)]}
H(A,B) = -Σ_{a,b} p(a|b) p(b) log[p(a|b)] - Σ_{a,b} p(a|b) p(b) log(p(b))
H(A,B) = -Σ_{a,b} p(a|b) p(b) log[p(a|b)] - Σ_b p(b) log(p(b)) Σ_a p(a|b)
H(A,B) = -Σ_{a,b} p(a|b) p(b) log[p(a|b)] - Σ_b p(b) log(p(b))     (since Σ_a p(a|b) = 1)
H(A,B) = H(A|B) + H(B)
Therefore I(A,B) = H(A) - H(A|B) = H(A) + H(B) - H(A,B)
Properties of Mutual Information
• MI is symmetric: I(A,B) = I(B,A)
• I(A,A) = H(A)
• I(A,B) <= H(A), I(A,B) <= H(B)
– The information each image contains about the other cannot be greater than the information the images themselves contain.
• I(A,B) >= 0
– Knowing B cannot increase the uncertainty in A.
• If A and B are independent, then I(A,B) = 0
• If A and B are jointly Gaussian with correlation coefficient ρ, then I(A,B) = -(1/2) log(1 - ρ²)
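The first few properties can be checked numerically. The sketch below is an assumed illustration (not from the slides) using the same histogram-based estimators as the earlier examples; it checks symmetry, I(A,A) = H(A), and the upper bound on a small synthetic array.

```python
import numpy as np

def entropies(a, b, bins=16):
    """Return (H(A), H(B), H(A,B)) in bits from a joint histogram."""
    p_ab, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab /= p_ab.sum()
    H = lambda p: float(-np.sum(p[p > 0] * np.log2(p[p > 0])))
    return H(p_ab.sum(axis=1)), H(p_ab.sum(axis=0)), H(p_ab)

def mi(a, b, bins=16):
    h_a, h_b, h_ab = entropies(a, b, bins)
    return h_a + h_b - h_ab              # definition 2

rng = np.random.default_rng(0)
A = rng.random((64, 64))
B = rng.random((64, 64))
print(np.isclose(mi(A, B), mi(B, A)))            # symmetry: I(A,B) = I(B,A)
print(np.isclose(mi(A, A), entropies(A, A)[0]))  # I(A,A) = H(A)
print(mi(A, B) <= min(entropies(A, B)[:2]))      # I(A,B) <= H(A), H(B)
```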
M.I. for Image Registration [example figures]
M.I. Processing Flow for Image Registration
Input Images → Pre-processing → Probability Density Estimation → M.I. Estimation → Optimization Scheme → Image Transformation → Output Image
Pseudo Code
• Initialize the transformation parameters a; I is the template image, J is the image to be registered to I.
• Transform J to J(a)
• Repeat
– Step 1: Compute MI(a) = mutual information between J(a) and I
– Step 2: Find Δa by optimization so that MI(a + Δa) > MI(a)
– Step 3: Update the transformation parameters: a = a + Δa
– Step 4: Transform J to J(a)
• Until convergence
What type of optimization can be applied here?
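In practice, both gradient-based and derivative-free optimizers (e.g., Powell's method or a simplex search) over continuous transformation parameters are commonly used. The sketch below is an assumed, simplified instance of the pseudo code: translation-only registration with a greedy integer-shift search, chosen so the loop structure (compute MI, find Δa, update a) stays visible. The function names and test image are illustrative, not from the slides.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """MI in bits, estimated from a joint intensity histogram."""
    p_ab, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab /= p_ab.sum()
    H = lambda p: float(-np.sum(p[p > 0] * np.log2(p[p > 0])))
    return H(p_ab.sum(axis=1)) + H(p_ab.sum(axis=0)) - H(p_ab)

def register_translation(I, J, max_iters=200):
    """Find an integer shift a = (dy, dx) maximizing MI between J(a) and I."""
    a = np.array([0, 0])                                   # initial parameters
    best = mutual_information(I, np.roll(J, tuple(a), axis=(0, 1)))
    for _ in range(max_iters):
        improved = False
        for da in ([1, 0], [-1, 0], [0, 1], [0, -1]):      # candidate updates Δa
            cand = a + da
            score = mutual_information(I, np.roll(J, tuple(cand), axis=(0, 1)))
            if score > best:                               # accept if MI(a + Δa) > MI(a)
                a, best, improved = cand, score, True
        if not improved:                                   # no Δa improves MI: converged
            break
    return a, best

# Usage: shift a smooth test image by a known amount and try to recover the shift.
x, y = np.meshgrid(np.linspace(-2, 2, 64), np.linspace(-2, 2, 64))
I = np.exp(-(x**2 + y**2))                 # smooth blob, so the MI surface is well behaved
J = np.roll(I, (-3, 5), axis=(0, 1))
print(register_translation(I, J))          # should recover a shift close to (3, -5)
```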