image segmentation

Image Segmentation
Marc Pollefeys, ETH Zurich
Computer Vision WS 08/09 (Perceptual and Sensory Augmented Computing)
Slide credits: V. Ferrari, K. Grauman, B. Leibe, S. Lazebnik, S. Seitz, Y. Boykov, W. Freeman, P. Kohli


  1. Segmentation as Clustering
     • Depending on what we choose as the feature space, we can group pixels in different ways.
     • Grouping pixels based on texture similarity: responses F1, F2, …, F24 of a filter bank of 24 filters
     • Feature space: filter bank responses (e.g. 24D)
     Slide credit: Kristen Grauman

  2. Spatial coherence
     • Assign a cluster label per pixel → possible discontinuities
     • How can we ensure they are spatially smooth?
     [Figure: original image; pixels labeled by cluster center's intensity (clusters 1, 2, 3)]
     Slide adapted from Kristen Grauman

  3. Spatial coherence
     • Depending on what we choose as the feature space, we can group pixels in different ways.
     • Grouping pixels based on intensity + position similarity (feature axes: intensity, X, Y)
     ⇒ Way to encode both similarity and proximity.
     Slide adapted from Kristen Grauman

  4. K-Means without spatial information
     • K-means clustering based on intensity or color is essentially vector quantization of the image attributes
       ◦ Clusters don't have to be spatially coherent
     [Figure: image; intensity-based clusters; color-based clusters]
     Slide adapted from Svetlana Lazebnik. Image source: Forsyth & Ponce

  5. K-Means with spatial information
     • K-means clustering based on intensity or color is essentially vector quantization of the image attributes
       ◦ Clusters don't have to be spatially coherent
     • Clustering based on (r,g,b,x,y) values enforces more spatial coherence
     Slide adapted from Svetlana Lazebnik. Image source: Forsyth & Ponce
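To make the difference between slides 4 and 5 concrete, here is a minimal NumPy sketch of k-means on (r,g,b,x,y) feature vectors. The function names `pixel_features` and `kmeans`, the `spatial_weight` parameter, and the random initialization are assumptions of this sketch, not part of the lecture.

```python
import numpy as np

def pixel_features(image, spatial_weight=1.0):
    """Stack (color..., x, y) per pixel; spatial_weight scales the position terms."""
    h, w, c = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([xs, ys], axis=-1).reshape(-1, 2) * spatial_weight
    return np.hstack([image.reshape(-1, c).astype(float), pos])

def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd iterations; returns per-pixel labels and cluster centers."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest center in squared Euclidean distance.
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each center moves to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

Setting `spatial_weight=0` makes the position columns constant, which recovers the purely color-based clustering of slide 4; larger values trade color similarity for spatial compactness. The pairwise distance array has size N×k×d, so this sketch is only meant for small images.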

  6. Summary K-Means
     • Pros
       ◦ Simple, fast to compute
       ◦ Converges to local minimum of within-cluster squared error
     • Cons/issues
       ◦ Setting k?
       ◦ Sensitive to initial centers
       ◦ Sensitive to outliers
       ◦ Detects spherical clusters only
       ◦ Assuming means can be computed
     Slide credit: Kristen Grauman

  7. Probabilistic Clustering
     • Basic questions
       ◦ What's the probability that a point x is in cluster m?
       ◦ What's the shape of each cluster?
     • K-means doesn't answer these questions.
     • Basic idea
       ◦ Instead of treating the data as a bunch of points, assume that they are all generated by sampling a continuous function.
       ◦ This function is called a generative model.
       ◦ Defined by a vector of parameters θ
     Slide credit: Steve Seitz

  8. Mixture of Gaussians
     • One generative model is a mixture of Gaussians (MoG)
       ◦ K Gaussian blobs with means μ_b, covariance matrices V_b, dimension d
         – Blob b defined by: [equation not preserved]
       ◦ Blob b is selected with probability [symbol not preserved]
       ◦ The likelihood of observing x is a weighted mixture of Gaussians: [equation not preserved]
     Slide adapted from Steve Seitz
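The equations on this slide are missing from the transcript. For reference, the standard textbook form of a mixture of Gaussians is given below; the symbol $\alpha_b$ for the mixing weight is an assumption, since the slide's own notation is not recoverable.

```latex
% Density of blob b (d-dimensional Gaussian with mean \mu_b and covariance V_b)
P(x \mid \mu_b, V_b) = \frac{1}{\sqrt{(2\pi)^d \, |V_b|}}
    \exp\!\left( -\tfrac{1}{2} (x - \mu_b)^T V_b^{-1} (x - \mu_b) \right)

% Blob b is selected with probability \alpha_b, where \sum_{b=1}^{K} \alpha_b = 1

% Likelihood of observing x: weighted mixture of the K Gaussians
P(x \mid \theta) = \sum_{b=1}^{K} \alpha_b \, P(x \mid \mu_b, V_b),
\qquad \theta = (\mu_1, \dots, \mu_K, V_1, \dots, V_K, \alpha_1, \dots, \alpha_K)
```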

  9. Expectation Maximization (EM)
     • Goal
       ◦ Find blob parameters θ that maximize the likelihood function over all N data points
     • Approach:
       1. E-step: given current guess of blobs, compute probabilistic ownership of each point
       2. M-step: given ownership probabilities, update blobs to maximize likelihood function
       3. Repeat until convergence
     Slide adapted from Steve Seitz

  10. EM Details
     • E-step
       ◦ Compute probability that point x is in blob b, given current guess of θ
     • M-step
       ◦ Compute overall probability that blob b is selected (N data points)
       ◦ Mean of blob b
       ◦ Covariance of blob b
     [The update equations on this slide are not preserved.]
     Slide adapted from Steve Seitz
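The E-step and M-step can be sketched as follows, using the standard textbook updates for a Gaussian mixture (the slide's own formulas are not in the transcript). The names `em_mog` and `gaussian_pdf`, the caller-supplied `init_means`, and the small regularizer `reg` added to each covariance are assumptions of this sketch.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Density of N(mean, cov) evaluated at every row of X."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * np.einsum("ni,ij,nj->n", diff, inv, diff)) / norm

def em_mog(X, init_means, n_iter=50, reg=1e-6):
    """EM for a mixture of Gaussians; returns weights, means, covariances, ownerships."""
    X = np.asarray(X, dtype=float)
    N, d = X.shape
    means = np.array(init_means, dtype=float)
    K = len(means)
    base_cov = np.cov(X.T).reshape(d, d) + reg * np.eye(d)
    covs = np.array([base_cov.copy() for _ in range(K)])
    alphas = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: ownership of point n by blob b, proportional to alpha_b * N(x_n | mu_b, V_b).
        lik = np.stack(
            [alphas[b] * gaussian_pdf(X, means[b], covs[b]) for b in range(K)], axis=1
        )
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step: re-estimate weight, mean and covariance of each blob
        # as ownership-weighted averages.
        Nb = resp.sum(axis=0)
        alphas = Nb / N
        means = (resp.T @ X) / Nb[:, None]
        for b in range(K):
            diff = X - means[b]
            covs[b] = (resp[:, b, None] * diff).T @ diff / Nb[b] + reg * np.eye(d)
    return alphas, means, covs, resp
```

Passing the initial means explicitly reflects the advice on the summary slide that starting from the output of k-means is often a good idea; a poor start can leave EM near a degenerate solution for many iterations.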

  11. Applications of EM
     • Turns out this is useful for all sorts of problems
       ◦ Any clustering problem
       ◦ Any model estimation problem
       ◦ Missing data problems
       ◦ Finding outliers
       ◦ Segmentation problems
         – Segmentation based on color
         – Segmentation based on motion
         – Foreground/background separation
       ◦ ...
     • EM demo
       ◦ http://lcn.epfl.ch/tutorial/english/gaussian/html/index.html
     Slide credit: Steve Seitz

  12. Segmentation with EM
     [Figure: original image; EM segmentation results for k=2, k=3, k=4, k=5]
     Slide credit: B. Leibe. Image source: Serge Belongie

  13. Summary: Mixtures of Gaussians, EM
     • Pros
       ◦ Probabilistic interpretation
       ◦ Soft assignments between data points and clusters
       ◦ Generative model, can predict novel data points
       ◦ Relatively compact storage ([expression not preserved])
     • Cons
       ◦ Initialization
         – often a good idea to start from output of k-means
       ◦ Local minima
       ◦ Need to know number of components K
         – solutions: model selection (AIC, BIC), Dirichlet process mixture
       ◦ Need to choose generative model (math form of a cluster?)
       ◦ Numerical problems are often a nuisance
     Slide adapted from B. Leibe

  14. Topics of This Lecture
     • Introduction
       ◦ Gestalt principles
       ◦ Image segmentation
     • Segmentation as clustering
       ◦ k-Means
       ◦ Feature spaces
       ◦ Mixture of Gaussians, EM
     • Model-free clustering: Mean-Shift
     • Graph theoretic segmentation: Normalized Cuts
     • Interactive Segmentation with GraphCuts

  15. Finding Modes in a Histogram
     • How many modes are there?
       ◦ Mode = local maximum of a given distribution
       ◦ Easy to see, hard to compute
     Slide adapted from Steve Seitz

  16. Mean-Shift Segmentation
     • An advanced and versatile technique for clustering-based segmentation
     http://coewww.rutgers.edu/riul/FORMER/comanici/MSPAMI/msPamiResults.html
     D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.
     Slide credit: Svetlana Lazebnik

  17. Mean-Shift Algorithm
     • Iterative Mode Search
       1. Initialize random seed center and window W
       2. Calculate center of gravity (the "mean") of W: [equation not preserved]
       3. Shift the search window to the mean
       4. Repeat steps 2+3 until convergence
     Slide adapted from Steve Seitz
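The four steps above can be sketched as follows. The sketch assumes a flat (uniform) window of radius `h`, so the center of gravity is the plain average of the points inside the window; the function name `mean_shift_mode` and the stopping tolerance are assumptions.

```python
import numpy as np

def mean_shift_mode(points, start, h, tol=1e-6, max_iter=500):
    """Follow the mean-shift vector from `start` until the window stops moving."""
    points = np.asarray(points, dtype=float)
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Step 2: center of gravity of the points inside the window W.
        in_window = np.linalg.norm(points - center, axis=1) <= h
        if not in_window.any():
            break  # empty window: nothing to average, stay where we are
        new_center = points[in_window].mean(axis=0)
        # Step 3: shift the window; step 4: stop once the shift is negligible.
        if np.linalg.norm(new_center - center) < tol:
            return new_center
        center = new_center
    return center
```

The difference `new_center - center` is the mean-shift vector illustrated on the following slides.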

  18. Mean-Shift
     [Figure: region of interest, center of mass, mean-shift vector]
     Slide by Y. Ukrainitz & B. Sarel

  24. Mean-Shift
     [Figure: region of interest and center of mass after convergence]
     Slide by Y. Ukrainitz & B. Sarel

  25. Real Modality Analysis
     • Tessellate the space with windows
     • Run the procedure in parallel
     Slide by Y. Ukrainitz & B. Sarel

  26. Real Modality Analysis
     • The blue data points were traversed by the windows towards the mode.
     Slide by Y. Ukrainitz & B. Sarel

  27. Mean-Shift Clustering
     • Cluster: all data points in the attraction basin of a mode
     • Attraction basin: the region for which all trajectories lead to the same mode
     Slide by Y. Ukrainitz & B. Sarel

  28. Mean-Shift Clustering/Segmentation
     • Choose features (color, gradients, texture, etc.)
     • Initialize windows at individual pixel locations
     • Start mean-shift from each window until convergence
     • Merge windows that end up near the same "peak" or mode
     Slide adapted from Svetlana Lazebnik
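A minimal sketch of this procedure, again assuming a flat window of radius `h`: one window is started at every data point and run to convergence, and end points that land within `h / 2` of an existing mode are merged into it. The merge threshold and the function name `mean_shift_cluster` are assumptions of this sketch.

```python
import numpy as np

def mean_shift_cluster(points, h, tol=1e-6, max_iter=300):
    """Label each point by the mode its window converges to."""
    points = np.asarray(points, dtype=float)
    modes = []
    labels = np.empty(len(points), dtype=int)
    for i, center in enumerate(points.copy()):
        # Run mean-shift from a window centered on point i.
        for _ in range(max_iter):
            window = points[np.linalg.norm(points - center, axis=1) <= h]
            new_center = window.mean(axis=0)
            converged = np.linalg.norm(new_center - center) < tol
            center = new_center
            if converged:
                break
        # Merge with an already discovered mode if the end point is close to it.
        for j, mode in enumerate(modes):
            if np.linalg.norm(center - mode) < h / 2:
                labels[i] = j
                break
        else:
            modes.append(center)
            labels[i] = len(modes) - 1
    return labels, np.array(modes)
```

For image segmentation the rows of `points` would be per-pixel feature vectors; the cost grows quadratically with the number of pixels in this naive form, which matches the "computationally rather expensive" remark on the summary slide.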

  29. Mean-Shift Segmentation Results
     http://coewww.rutgers.edu/riul/FORMER/comanici/MSPAMI/msPamiResults.html
     Slide credit: Svetlana Lazebnik

  30. More Results
     Slide credit: Svetlana Lazebnik

  31. Summary Mean-Shift
     • Pros
       ◦ General, application-independent tool
       ◦ Model-free, does not assume any prior shape (spherical, elliptical, etc.) on data clusters
       ◦ Just a single parameter (window size h)
         – h has a physical meaning (unlike k-means) == scale of clustering
       ◦ Finds variable number of modes given the same h
       ◦ Robust to outliers
     • Cons
       ◦ Output depends on window size h
       ◦ Window size (bandwidth) selection is not trivial
       ◦ Computationally rather expensive
       ◦ Does not scale well with dimension of feature space
     Slide adapted from Svetlana Lazebnik

  33. Images as Graphs
     • Fully-connected graph
       ◦ Node (vertex) for every pixel
       ◦ Edge between every pair of pixels (p,q)
       ◦ Affinity weight w_pq for each edge
         – w_pq measures similarity
         – Similarity is inversely proportional to difference (in color, texture, position, …)
     Slide adapted from Steve Seitz

  34. Segmentation by Graph Cuts
     • Break graph into segments (A, B, C in the figure)
       ◦ Delete edges crossing between segments
       ◦ Easiest to break edges with low similarity (low weight)
         – Similar pixels should be in the same segments
         – Dissimilar pixels should be in different segments
     Slide adapted from Steve Seitz

  35. Measuring Affinity
     • Distance: $\mathrm{aff}(x,y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \|x - y\|^2 \right\}$
     • Intensity: $\mathrm{aff}(x,y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \|I(x) - I(y)\|^2 \right\}$
     • Color: $\mathrm{aff}(x,y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \, \mathrm{dist}\big(c(x), c(y)\big)^2 \right\}$ (some suitable color space distance)
     • Texture: $\mathrm{aff}(x,y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \|f(x) - f(y)\|^2 \right\}$ (vectors of filter outputs)
     Source: Forsyth & Ponce
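All four affinities share the same Gaussian form and differ only in the feature being compared. A minimal sketch that builds the full affinity matrix for a set of feature vectors (position, intensity, color, or filter responses) with Euclidean distance; the function name `gaussian_affinity` is an assumption.

```python
import numpy as np

def gaussian_affinity(features, sigma):
    """W[i, j] = exp(-||f_i - f_j||^2 / (2 sigma^2)) for all pairs of rows."""
    features = np.asarray(features, dtype=float)
    sq_dist = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma**2))
```

A smaller `sigma` drives the affinity of distant pairs toward zero, so only nearby points are grouped; a larger `sigma` keeps far-away pairs connected.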

  36. Scale Affects Affinity
     • Small σ: group only nearby points
     • Large σ: group far-away points
     [Figure: data points and their affinity matrices for small, medium and large σ]
     Slide adapted from Svetlana Lazebnik. Image source: Forsyth & Ponce

  37. Graph Cut (GC)
     • GC = edges whose removal partitions a graph in two (A and B)
     • Cost of a cut
       ◦ Sum of weights of cut edges: $\mathrm{cut}(A,B) = \sum_{p \in A,\, q \in B} w_{p,q}$
     • A graph cut gives us a segmentation
       ◦ What is a "good" graph cut and how do we find one?
     Slide adapted from Steve Seitz

  38. Graph Cut
     • Here, the cut is nicely defined by the block-diagonal structure of the affinity matrix.
     ⇒ How can this be generalized?
     Slide credit: B. Leibe. Image source: Forsyth & Ponce

  39. Minimum Cut
     • We can do segmentation by finding the minimum cut in a graph
       ◦ Efficient algorithms exist for doing this
     • Drawback:
       ◦ Weight of cut proportional to number of edges in the cut
       ◦ Minimum cut tends to cut off very small, isolated components
     [Figure: ideal cut versus cuts with lesser weight than the ideal cut]
     Slide credit: Khurram Hassan-Shafique

  40. Normalized Cut (NCut)
     • Min-cut has bias toward partitioning out small segments
     • This can be fixed by normalizing for size of segments
     • The normalized cut cost is: $\mathrm{NCut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}$
       ◦ assoc(A,V) = sum of weights from A to all nodes in the graph
     • The exact solution is NP-hard but an approximation can be computed by solving a generalized eigenvalue problem.
     J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000
     Slide adapted from Svetlana Lazebnik
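To see how the normalization removes the bias toward small segments, the two costs can be evaluated directly from a symmetric affinity matrix `W` and a boolean membership vector for segment A. The function names are assumptions of this sketch.

```python
import numpy as np

def cut_value(W, in_A):
    """Sum of weights of edges going from segment A to segment B."""
    in_A = np.asarray(in_A, dtype=bool)
    return W[np.ix_(in_A, ~in_A)].sum()

def ncut_value(W, in_A):
    """Normalized cut: cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)."""
    in_A = np.asarray(in_A, dtype=bool)
    c = cut_value(W, in_A)
    assoc_A = W[in_A, :].sum()   # weights from A to all nodes
    assoc_B = W[~in_A, :].sum()  # weights from B to all nodes
    return c / assoc_A + c / assoc_B
```

When A is a single weakly connected node, assoc(A,V) is small and the first term approaches 1, so splitting off tiny components is penalized even if the raw cut weight is low.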

  41. Interpretation as a Dynamical System
     • Treat the edges as springs and 'shake' the system
       ◦ Elasticity proportional to cost
       ◦ Vibration "modes" correspond to segments
         – Can compute these by solving a generalized eigenvector problem
     Slide adapted from Steve Seitz

  42. NCuts as a Generalized Eigenvector Problem
     • Definitions
       ◦ W: the affinity matrix, $W(i,j) = w_{i,j}$
       ◦ D: the diagonal matrix, $D(i,i) = \sum_j W(i,j)$
       ◦ x: a vector in $\{1,-1\}^N$, $x(i) = 1 \Leftrightarrow i \in A$
     • Rewriting Normalized Cut in matrix form:
       $\mathrm{NCut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)} = \frac{1}{4}\left[ \frac{(1+x)^T (D-W)(1+x)}{k\, 1^T D 1} + \frac{(1-x)^T (D-W)(1-x)}{(1-k)\, 1^T D 1} \right] = \dots$
       with $k = \frac{\sum_{x_i > 0} D(i,i)}{\sum_i D(i,i)}$
       (the factor 1/4 arises because $(1+x)/2$ is the indicator vector of A; it does not change the minimizer)
     Slide credit: Jitendra Malik

  43. Some More Math…
     Slide credit: Jitendra Malik

  44. NCuts as a Generalized Eigenvalue Problem
     • After simplification, we get
       $\mathrm{NCut}(A,B) = \frac{y^T (D-W)\, y}{y^T D\, y}$, with $y_i \in \{1, -b\}$, $y^T D 1 = 0$.
       ◦ This is hard, as y is discrete!
     • This is a Rayleigh quotient
       ◦ Relaxation: y is continuous.
       ◦ Solution given by the "generalized" eigenvalue problem $(D-W)\, y = \lambda D y$
       ◦ Solved by converting to the standard eigenvalue problem $D^{-1/2} (D-W) D^{-1/2} z = \lambda z$, with $z = D^{1/2} y$
     • Subtleties
       ◦ Smallest eigenvector is $z_0 = D^{1/2} 1$ with eigenvalue 0 (and $y_0 = 1$)
       ◦ Optimal solution is second smallest eigenvector
       ◦ Gives continuous result; must convert into discrete values of y
     Slide adapted from Alyosha Efros

  45. NCuts Example
     [Figure: smallest eigenvectors; NCuts segments]
     Slide credit: B. Leibe. Image source: Shi & Malik

  46. NCuts: Overall Procedure
     1. Construct a weighted graph G=(V,E) from an image.
     2. Connect each pair of pixels, and assign graph edge weights, W(i,j) = Prob. that i and j belong to the same region.
     3. Solve $(D-W)\, y = \lambda D y$ for the smallest few eigenvectors. This yields a continuous solution.
     4. Threshold eigenvectors to get a discrete cut
        ◦ This is where the approximation is made (we're not solving NP).
     5. Recursively subdivide if NCut value is below a pre-specified value.
     NCuts Matlab code available at http://www.cis.upenn.edu/~jshi/software/
     Slide credit: Jitendra Malik
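Steps 3 and 4 can be sketched for a single two-way split with NumPy. This is not the Matlab implementation linked above: it solves the equivalent standard symmetric eigenproblem $D^{-1/2}(D-W)D^{-1/2} z = \lambda z$ with a dense solver and thresholds the second-smallest eigenvector at zero, which is one simple choice of splitting point. The function name `ncut_bipartition` is an assumption, and every node is assumed to have positive degree.

```python
import numpy as np

def ncut_bipartition(W):
    """Relaxed normalized cut: returns (boolean segment mask, continuous y, eigenvalues)."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # degrees D(i,i)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # D^{-1/2} (D - W) D^{-1/2} = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    # Second-smallest eigenvector z, mapped back to y = D^{-1/2} z.
    y = d_inv_sqrt * eigvecs[:, 1]
    # Discretize the continuous solution by thresholding at zero.
    return y > 0, y, eigvals
```

For a real image the affinity matrix is large, so a sparse graph and a sparse eigensolver would normally replace the dense `eigh` call; the recursion of step 5 would apply the same function to each resulting segment.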

  47. Color Image Segmentation with NCuts
     Slide credit: Steve Seitz. Image source: Shi & Malik

  48. Results with Color & Texture

  49. Summary: Normalized Cuts
     • Pros:
       ◦ Generic framework, flexible to choice of function that computes weights ("affinities") between nodes
       ◦ Does not require any model of the data distribution
     • Cons:
       ◦ Time and memory complexity can be high
         – Dense, highly connected graphs ⇒ many affinity computations
         – Solving eigenvalue problem
       ◦ Preference for balanced partitions
         – If a region is uniform, NCuts will find the modes of vibration of the image dimensions
     Slide credit: Kristen Grauman

  50. Segmentation: Caveats
     • We've looked at bottom-up ways to segment an image into regions, yet finding meaningful segments is intertwined with the recognition problem.
     • Often want to avoid making hard decisions too soon
     • Difficult to evaluate; when is a segmentation successful?
     Slide credit: Kristen Grauman

  52. Markov Random Fields
     • Allow rich probabilistic models for images
     • But built in a local, modular way
       ◦ Learn local effects, get global effects out
     [Figure: observed evidence; hidden "true states"; neighborhood relations]
     Slide credit: William Freeman

  53. MRF Nodes as Pixels (or Patches)
     • Image pixels $y_i$, linked to the states by $\Phi(x_i, y_i)$
     • Image states $x_i$ (e.g. foreground/background), linked to each other by $\Psi(x_i, x_j)$
     Slide adapted from William Freeman

  54. Network Joint Probability
     $P(x,y) = \prod_i \Phi(x_i, y_i) \prod_{i,j} \Psi(x_i, x_j)$
     • x: states, y: image
     • $\Phi$: image-state compatibility function (local observations)
     • $\Psi$: state-state compatibility function (neighboring nodes)
     Slide adapted from William Freeman

  55. Energy Formulation
     • Joint probability
       $P(x,y) = \prod_i \Phi(x_i, y_i) \prod_{i,j} \Psi(x_i, x_j)$
     • Maximizing the joint probability is the same as minimizing the negative log
       $\log P(x,y) = \sum_i \log \Phi(x_i, y_i) + \sum_{i,j} \log \Psi(x_i, x_j)$
       $E(x,y) = \sum_i \varphi(x_i, y_i) + \sum_{i,j} \psi(x_i, x_j)$, where $E = -\log P$, $\varphi = -\log \Phi$, $\psi = -\log \Psi$
     • This is similar to free-energy problems in statistical mechanics (spin glass theory). We therefore draw the analogy and call E an energy function.
     • φ and ψ are called potentials.
     Slide credit: B. Leibe

  56. Energy Formulation
     • Energy function
       $E(x,y) = \sum_i \varphi(x_i, y_i) + \sum_{i,j} \psi(x_i, x_j)$
       (first sum: unary potentials; second sum: pairwise potentials)
     • Unary potentials φ
       ◦ Encode local information about the given pixel/patch
       ◦ How likely is a pixel/patch to be in a certain state (e.g. foreground/background)?
     • Pairwise potentials ψ
       ◦ Encode neighborhood information
       ◦ How different is a pixel/patch's label from that of its neighbor? (e.g. here independent of image data, but later based on intensity/color/texture difference)
     Slide adapted from B. Leibe
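As a small worked example, the energy of a given labeling can be evaluated on a 4-connected pixel grid. The sketch assumes a Potts-style pairwise potential (a constant penalty whenever two neighbors carry different labels, independent of the image data) and a precomputed table of unary costs; the function name `mrf_energy` and the array layout are assumptions.

```python
import numpy as np

def mrf_energy(labels, unary, pairwise_weight=1.0):
    """E = sum_i unary[i, label_i] + pairwise_weight * (number of unequal neighbor pairs).

    labels: (H, W) integer array of states x_i
    unary:  (H, W, n_states) array, unary[r, c, s] = cost of state s at pixel (r, c)
    """
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    unary_term = unary[rows, cols, labels].sum()
    # Count label changes across horizontal and vertical neighbor pairs.
    changes = (labels[:, 1:] != labels[:, :-1]).sum() + (labels[1:, :] != labels[:-1, :]).sum()
    return unary_term + pairwise_weight * changes
```

Comparing this value for two candidate labelings shows the trade-off the energy encodes: a labeling that follows the unary costs exactly may still lose to a smoother one if it introduces many label changes.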

  57. Energy Minimization
     • Goal:
       ◦ Infer the optimal labeling of the MRF.
     • Many inference algorithms are available, e.g.
       ◦ Gibbs sampling, simulated annealing
       ◦ Iterated conditional modes (ICM)
       ◦ Variational methods
       ◦ Belief propagation
       ◦ Graph cuts
     • Recently, Graph Cuts have become a popular tool
       ◦ Only suitable for a certain class of energy functions
       ◦ But the solution can be obtained very fast for typical vision problems (~1MPixel/sec).
     Slide credit: B. Leibe

  58. Graph Cuts for Optimal Boundary Detection
     • Idea: convert MRF into source-sink graph
       [Figure: terminals s and t, n-links between pixels, hard constraints, a cut]
     • Minimum cost cut can be computed in polynomial time (max-flow/min-cut algorithms)
     [Boykov & Jolly, ICCV'01]
     Slide adapted from Yuri Boykov

  59. Simple Example of Energy
     $E(L) = \sum_p D_p(L_p) + \sum_{pq \in N} w_{pq} \cdot \delta(L_p \neq L_q)$
     • Regional term: $\sum_p D_p(L_p)$ (t-links, with weights $D_p(s)$ and $D_p(t)$)
     • Boundary term: $\sum_{pq \in N} w_{pq} \cdot \delta(L_p \neq L_q)$ (n-links), with $w_{pq} = \exp\left\{ -\frac{\Delta I_{pq}^2}{2\sigma^2} \right\}$
     • $L_p \in \{s, t\}$ (binary segmentation)
     Slide credit: Yuri Boykov
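The construction can be tried on a 1-D row of pixels with a small, pure-Python max-flow. This is a sketch under several assumptions: $D_p(l)$ is taken to be a penalty, the squared difference between the pixel intensity and an expected intensity for label $l$; neighbors are adjacent pixels; the n-link weight is scaled by a hypothetical factor `lam`; and the solver is a plain Edmonds-Karp, not the specialized algorithm used in practice. All function and parameter names are hypothetical.

```python
from collections import deque
import math

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity graph; returns (flow, source-side nodes)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0.0)  # reverse residual edges
    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, set(parent)  # nodes still reachable from s
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
        flow += bottleneck

def segment_1d(intensities, I_obj, I_bkg, sigma=30.0, lam=1.0):
    """Binary labeling ('s' = object, 't' = background) minimizing E(L) by min-cut."""
    cap = {"s": {}, "t": {}}
    for p, I in enumerate(intensities):
        cap["s"][p] = (I - I_bkg) ** 2    # cut when p ends on the t side: pays D_p(t)
        cap[p] = {"t": (I - I_obj) ** 2}  # cut when p ends on the s side: pays D_p(s)
    for p in range(len(intensities) - 1):
        diff = intensities[p] - intensities[p + 1]
        w = lam * math.exp(-diff**2 / (2 * sigma**2))  # n-link weight w_pq
        cap[p][p + 1] = w
        cap[p + 1][p] = w
    energy, source_side = max_flow_min_cut(cap, "s", "t")
    labels = ["s" if p in source_side else "t" for p in range(len(intensities))]
    return labels, energy
```

By the max-flow/min-cut theorem the returned flow value equals the minimum of E(L), so `energy` is the energy of the returned labeling under the assumptions above.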

  60. Adding Regional Properties
     [Figure: terminals s and t, t-links with weights $D_p(s)$ and $D_p(t)$, n-links with weights $w_{pq}$, a cut]
     • Regional bias example
       ◦ Suppose $I^s$ and $I^t$ are given: "expected" intensities of object and background
       ◦ $D_p(s) \propto \exp\left( -\|I_p - I^s\|^2 / 2\sigma^2 \right)$
       ◦ $D_p(t) \propto \exp\left( -\|I_p - I^t\|^2 / 2\sigma^2 \right)$
     • NOTE: hard constraints are not required, in general.
     [Boykov & Jolly, ICCV'01]
     Slide credit: Yuri Boykov
