Image Segmentation
• How do we know which groups of pixels in a digital image correspond to the objects to be analyzed?
    • Objects may be uniformly darker or brighter than the background against which they appear
        • Black characters imaged against the white background of a page
        • Bright, dense potatoes imaged against a background that is transparent to X-rays

Image Segmentation: Definitions
• “Segmentation is the process of partitioning an image into semantically interpretable regions.” - H. Barrow and J. Tennenbaum, 1978
• “An image segmentation is the partition of an image into a set of nonoverlapping regions whose union is the entire image. The purpose of segmentation is to decompose the image into parts that are meaningful with respect to a particular application.” - R. Haralick and L. Shapiro, 1992
• “The neurophysiologists’ and psychologists’ belief that figure and ground constituted one of the fundamental problems in vision was reflected in the attempts of workers in computer vision to implement a process called segmentation. The purpose of this process is very much like the idea of separating figure from ground ...” - D. Marr, 1982
• “The partitioning problem is to delineate regions that have, to a certain degree, coherent attributes in the image. We will refer to this problem as the image partitioning problem. It is an important problem because, on the whole, objects and coherent physical processes in the scene project into regions with coherent image attributes. Thus, the image partitioning problem can be viewed as a first approximation to the scene partitioning problem ...” - Y. LeClerc, 1989
Formal Definition
• Given a region R and a uniformity criterion U, define the predicate
    P(R) = True if ∃ a such that |U(i,j) - a| < ε, ∀ (i,j) ∈ R
• Partition the image into subsets R_i, i = 1, ..., m, such that
    • Complete: Image = ∪ R_i, i = 1, ..., m
    • Disjoint subsets: R_i ∩ R_j = ∅, ∀ i ≠ j
    • Uniform regions: P(R_i) = True, ∀ i
    • Maximal regions: P(R_i ∪ R_j) = False, ∀ i ≠ j
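The four conditions of the formal definition can be checked mechanically. The sketch below is only an illustration under stated assumptions: the uniformity criterion U is taken to be the gray level itself, regions are non-empty Python sets of (i, j) tuples, and the names `is_uniform` and `is_valid_segmentation` are hypothetical. It uses the fact that a value a with |U(i,j) - a| < ε for all pixels exists exactly when the spread of values in the region is below 2ε.

```python
from itertools import combinations

def is_uniform(image, region, eps):
    """Predicate P(R), assuming U(i,j) is simply the gray level image[i][j].

    Some value a with |U(i,j) - a| < eps for every pixel exists exactly
    when max - min < 2 * eps (take a as the midpoint of the range).
    """
    values = [image[i][j] for (i, j) in region]
    return max(values) - min(values) < 2 * eps

def is_valid_segmentation(image, regions, eps):
    """Check the complete, disjoint, uniform and maximal conditions."""
    all_pixels = {(i, j) for i in range(len(image)) for j in range(len(image[0]))}
    complete = set().union(*regions) == all_pixels
    disjoint = all(not (a & b) for a, b in combinations(regions, 2))
    uniform = all(is_uniform(image, r, eps) for r in regions)
    # As on the slide, maximality is tested for every pair of regions.
    maximal = all(not is_uniform(image, a | b, eps)
                  for a, b in combinations(regions, 2))
    return complete and disjoint and uniform and maximal

image = [[10, 11, 50],
         [10, 12, 52]]
regions = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(0, 2), (1, 2)}]
print(is_valid_segmentation(image, regions, eps=2))
```

Splitting the left region into two columns would still give complete, disjoint, uniform regions, but the maximality test would fail because the two halves could be merged without violating P.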
Image Segmentation
• Ideally, object pixels would be black (0 intensity) and background pixels white (maximum intensity)
• But this rarely happens because
    • Pixels overlap regions from both the object and the background, yielding intensities between pure black and white - edge blur
    • Cameras introduce “noise” during imaging - measurement “noise”
    • Potatoes have non-uniform “thickness”, giving variations in brightness in X-ray - model “noise”

Image Segmentation by Thresholding
• But if the objects and background occupy different ranges of gray levels, we can “mark” the object pixels by a process called thresholding:
    • Let F(i,j) be the original, gray level image
    • B(i,j) is a binary image (pixels are either 0 or 1) created by thresholding F(i,j):
        • B(i,j) = 1 if F(i,j) <= t
        • B(i,j) = 0 if F(i,j) > t
    • We will assume that the 1’s are the object pixels and the 0’s are the background pixels

Thresholding
• How do we choose the threshold t?
• Histogram: gray level frequency distribution of the gray level image F
    • h_F(k) = number of pixels in F whose gray level is k
    • H_F(k) = number of pixels in F whose gray level is <= k

[Figure: histogram h(g) versus intensity g, showing two peaks separated by a valley]
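The thresholding rule and the two histograms translate directly into code. This is a minimal sketch, assuming the image F is a list of rows of integer gray levels in the range 0 to levels - 1; the function names are illustrative and not part of the slides.

```python
def threshold(F, t):
    """Binary image B: B(i,j) = 1 if F(i,j) <= t, else 0."""
    return [[1 if v <= t else 0 for v in row] for row in F]

def histogram(F, levels=256):
    """h_F(k): number of pixels in F whose gray level is k."""
    h = [0] * levels
    for row in F:
        for v in row:
            h[v] += 1
    return h

def cumulative_histogram(h):
    """H_F(k): number of pixels in F whose gray level is <= k."""
    H, total = [], 0
    for count in h:
        total += count
        H.append(total)
    return H

F = [[0, 1, 3],
     [3, 3, 2]]
h = histogram(F, levels=4)
print(h, cumulative_histogram(h), threshold(F, 1))
```

With the convention above, dark pixels (gray level at or below t) are marked as object pixels, which matches the black-characters-on-white-page example.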
Thresholding
• P-tile method
    • In some applications we know approximately what percentage, p, of the pixels in the image come from objects
        • Might have one potato in the image, or one character.
    • H_F can be used to find the gray level, g, such that ~p% of the pixels have intensity <= g
    • Then, we can examine h_F in the neighborhood of g to find a good threshold (low valley point)
    • Could also examine the binary images corresponding to alternative thresholds to choose a “best” one, e.g., the one with the straightest edges, the most easily recognized objects, etc.
• Mode (peak and valley) method
    • Find the two most prominent peaks of h_F
        • g is a peak if h_F(g) > h_F(g ± Δg), Δg = 1, ..., k
    • Let g_1 and g_2 be the two highest peaks, with g_1 < g_2
    • Find the deepest valley, g, between g_1 and g_2
        • g ∈ [g_1, g_2] is the valley if h_F(g) <= h_F(g’), ∀ g’ ∈ [g_1, g_2]
    • Use g as the threshold
    • When the image contains 2 normally-distributed classes, it can be proved that the probability of misclassification is minimized when g is at the minimum point
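Both selection rules operate only on the histogram, so they can be sketched in a few lines. The code below is an illustration under assumptions that the slides leave open: the histogram h is a list of counts indexed by gray level, neighbors that fall outside the gray level range are ignored when testing for a peak, and ties for the deepest valley are resolved in favor of the lowest gray level. The function names are hypothetical.

```python
def p_tile_threshold(h, p):
    """Smallest gray level g such that at least p percent of pixels are <= g."""
    target = sum(h) * p / 100.0
    running = 0
    for g, count in enumerate(h):
        running += count          # running equals H_F(g)
        if running >= target:
            return g
    return len(h) - 1

def find_peaks(h, k):
    """g is a peak if h[g] > h[g + d] for every offset d = ±1, ..., ±k."""
    peaks = []
    for g in range(len(h)):
        neighbors = [h[g + d] for d in range(-k, k + 1)
                     if d != 0 and 0 <= g + d < len(h)]
        if all(h[g] > n for n in neighbors):
            peaks.append(g)
    return peaks

def mode_threshold(h, k=1):
    """Deepest valley between the two highest peaks of h."""
    peaks = find_peaks(h, k)
    if len(peaks) < 2:
        raise ValueError("histogram does not have two peaks")
    highest_two = sorted(peaks, key=lambda g: h[g], reverse=True)[:2]
    g1, g2 = sorted(highest_two)
    return min(range(g1, g2 + 1), key=lambda g: h[g])

h = [1, 5, 2, 1, 3, 8, 2]
print(p_tile_threshold(h, 30), find_peaks(h, 1), mode_threshold(h))
```

On real histograms the peak test is sensitive to noise, so a larger k or a smoothed histogram is usually needed before the two highest peaks are meaningful.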
Thresholding
• Hand selection
    • Select a threshold by hand at the beginning of the day
    • Use that threshold all day long!
• Many threshold selection methods in the literature
    • Probabilistic methods
        • Make parametric assumptions about object and background intensity distributions and then derive “optimal” thresholds
    • Structural methods
        • Evaluate a range of thresholds with respect to properties of resulting binary images
    • Local thresholding
        • Apply thresholding methods to image windows

An Advanced Threshold Selection Method: Minimizing Kullback Information Distance
• The observed histogram, f, is a mixture of the gray levels of the pixels from the object(s) and the pixels from the background
    • In an ideal world the histogram would contain just two spikes
    • But measurement noise, model noise and edge blur spread these spikes out into hills
• Make a parametric model of the shapes of the component histograms of the object(s) and background
Kullback Information Distance
• Parametric model: the component histograms are assumed to be Gaussian
    • p_o and p_b are the proportions of the image that comprise the objects and background
    • μ_o and μ_b are the mean gray levels of the objects and background
    • σ_o and σ_b are their standard deviations

$$f_o(g) = \frac{1}{\sqrt{2\pi}\,\sigma_o}\, e^{-\frac{1}{2}\left(\frac{g-\mu_o}{\sigma_o}\right)^2} \qquad f_b(g) = \frac{1}{\sqrt{2\pi}\,\sigma_b}\, e^{-\frac{1}{2}\left(\frac{g-\mu_b}{\sigma_b}\right)^2}$$

• Once we choose a threshold, t, all of these unknown parameters are determined.
    • Let f(g) be the observed, normalized histogram
        • f(g) = fraction of pixels in the image having gray level g
    • The proportions and means follow from the gray levels on each side of t:

$$p_o(t) = \sum_{g=0}^{t} f(g) \qquad p_b(t) = 1 - p_o(t)$$

$$\mu_o(t) = \frac{1}{p_o(t)} \sum_{g=0}^{t} f(g)\, g \qquad \mu_b(t) = \frac{1}{p_b(t)} \sum_{g=t+1}^{\max} f(g)\, g$$

    • The standard deviations σ_o(t) and σ_b(t) are computed in the same way from the gray levels on each side of t
• So, once t is chosen we can “predict” what the total normalized image histogram should be if our model (mixture of two Gaussians) is correct

$$P_t(g) = p_o f_o(g) + p_b f_b(g)$$

• The total normalized image histogram is really f(g)
• So, the question reduces to:
    • Determine a suitable way to measure the similarity of P_t and f
    • Find the t that gives the highest similarity
• A suitable similarity measure is the Kullback directed divergence, defined as

$$K(t) = \sum_{g=0}^{\max} f(g) \log\!\left[\frac{f(g)}{P_t(g)}\right]$$

• If P_t matches f exactly, then each term of the sum is 0 and K(t) takes on its minimal value of 0
• Gray levels where P_t and f disagree are penalized by the log term, weighted by the importance of that gray level (f(g))
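The search over thresholds can be sketched as follows. This is a minimal illustration, not a definitive implementation, and it rests on several assumptions: f is a normalized histogram given as a list of fractions summing to 1, each Gaussian has unit area so that the proportions enter only through P_t, the class means and standard deviations are normalized by the class proportions, gray levels with f(g) = 0 contribute nothing to the sum, and a threshold that leaves a class empty or with zero spread is skipped by assigning it an infinite distance. All function names are hypothetical.

```python
import math

def gaussian(g, mu, sigma):
    """Unit-area Gaussian density evaluated at gray level g."""
    return (math.exp(-0.5 * ((g - mu) / sigma) ** 2)
            / (math.sqrt(2 * math.pi) * sigma))

def class_stats(f, levels):
    """Proportion, mean and standard deviation of the gray levels in `levels`."""
    p = sum(f[g] for g in levels)
    if p == 0:
        return 0.0, 0.0, 0.0
    mu = sum(f[g] * g for g in levels) / p
    var = sum(f[g] * (g - mu) ** 2 for g in levels) / p
    return p, mu, math.sqrt(var)

def kullback_distance(f, t):
    """K(t) for the two-Gaussian mixture implied by threshold t."""
    n = len(f)
    p_o, mu_o, s_o = class_stats(f, range(0, t + 1))
    p_b, mu_b, s_b = class_stats(f, range(t + 1, n))
    if s_o == 0 or s_b == 0:
        return math.inf  # assumption: degenerate classes are not candidates
    K = 0.0
    for g in range(n):
        if f[g] > 0:
            P = p_o * gaussian(g, mu_o, s_o) + p_b * gaussian(g, mu_b, s_b)
            if P == 0.0:
                return math.inf
            K += f[g] * math.log(f[g] / P)
    return K

def best_kullback_threshold(f):
    """Threshold t whose predicted histogram P_t is closest to f."""
    return min(range(len(f) - 1), key=lambda t: kullback_distance(f, t))

counts = [0, 1, 4, 9, 4, 1, 0, 0, 0, 0, 1, 5, 12, 5, 1, 0]
f = [c / sum(counts) for c in counts]
print(best_kullback_threshold(f))
```

Because the Gaussians are sampled at integer gray levels, P_t need not sum exactly to 1, so the computed K(t) can deviate slightly from the ideal minimum of 0 even for a good fit.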
Another Threshold Selection Method: Minimize Probability of Error
• Using the same mixture model, we can find the t that minimizes the predicted probability of error during thresholding
• Two types of errors
    • Background points that are marked as object points. These are points from the background that are darker than the threshold

$$e_b(t) = p_b \sum_{g=0}^{t} f_b(g)$$

    • Object points that are marked as background points. These are points from the object that are brighter than the threshold

$$e_o(t) = p_o \sum_{g=t+1}^{\max} f_o(g)$$

• For each threshold
    • Compute the parameters of the two Gaussians and the proportions
    • Compute the two error probabilities
• Find the threshold that gives
    • Minimal overall error
    • Most equal errors

[Figure: the component distributions f_o and f_b with the threshold t marked]

Object Extraction from Binary Images: Connected Components
• Definition: Given a pixel (i,j), its 4-neighbors are the points (i’,j’) such that |i-i’| + |j-j’| = 1
    • the 4-neighbors are (i±1, j) and (i, j±1)
• Definition: Given a pixel (i,j), its 8-neighbors are the points (i’,j’) such that max(|i-i’|, |j-j’|) = 1
    • the 8-neighbors are (i, j±1), (i±1, j) and (i±1, j±1)

Adjacency
• Definition: Given two disjoint sets of pixels, S and T, S is 4-(8) adjacent to T if there is a pixel in S that is a 4-(8) neighbor of a pixel in T
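The neighbor and adjacency definitions map onto small set operations. The sketch below assumes pixels are (i, j) tuples and pixel sets are Python sets; image bounds are not checked, and the function names are illustrative rather than taken from the slides.

```python
def neighbors4(i, j):
    """Points (i', j') with |i - i'| + |j - j'| = 1."""
    return {(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)}

def neighbors8(i, j):
    """Points (i', j') with max(|i - i'|, |j - j'|) = 1."""
    return {(i + di, j + dj)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)}

def are_adjacent(S, T, connectivity=4):
    """True if some pixel of S is a 4- (or 8-) neighbor of a pixel of T."""
    neighbors = neighbors4 if connectivity == 4 else neighbors8
    return any(neighbors(i, j) & T for (i, j) in S)

S = {(0, 0)}
T = {(1, 1)}
print(are_adjacent(S, T, 4), are_adjacent(S, T, 8))
```

The example pair touches only diagonally, so the two sets are 8-adjacent but not 4-adjacent, which is why the choice of connectivity matters when extracting connected components.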