Unsupervised Learning of Object Deformation Models


  1. Unsupervised Learning of Object Deformation Models Iasonas Kokkinos and Alan Yuille Center for Image and Vision Sciences Seminar, Department of Statistics, UCLA Oct. 2007 Work appearing in ICCV 07

  2. Modelling Shape Variation via Deformations • Top-down approach: – Model variation in shape and appearance separately (`one thing at a time’). – Shape modelling via deformations: I(S(x)) = T(x), where T is the template, I the image instance, and S the deformation field. • Success Stories: – Deformable Templates, Active Contours. – ASM/AAM models. – MRF models for object detection/tracking. • Our goal: Learn without manual annotations.
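The warping relation above can be sketched in a few lines; this is a minimal numpy illustration (forward, nearest-neighbour warping, with all function names our own), not the fitting machinery used in the talk:

```python
import numpy as np

def warp_image(template, dx, dy):
    """Apply a deformation field S(x) = x + (dx, dy) to a template:
    the instance I satisfies I(S(x)) = T(x), so template values are
    pushed to their deformed positions (nearest-neighbour, for clarity;
    holes and fold-overs are ignored in this sketch)."""
    h, w = template.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xs_d = np.clip(np.round(xs + dx).astype(int), 0, w - 1)
    ys_d = np.clip(np.round(ys + dy).astype(int), 0, h - 1)
    instance = np.zeros_like(template)
    instance[ys_d, xs_d] = template  # push values to deformed positions
    return instance

# pure translation by (1, 0): the template content shifts right by one pixel
T = np.zeros((4, 4))
T[1, 1] = 1.0
I = warp_image(T, dx=1.0, dy=0.0)
```

A real deformation field varies per pixel; a constant shift is used only to keep the example checkable by eye.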

  3. Mainstream Approaches to Learning Object Models • Appearance and shape of interest points. + Efficient, scale-invariant detection. + Automated learning of object detection models. - Not really generative models, and therefore unsuitable for segmentation, tracking/analysis, and other tasks in the `pipeline’. - Questionable shape models (Gaussians over a fixed set of interest points). Example: for parallelograms, all contour information can be recovered from a 4-point model, but why require `magic picture’ skills?

  4. Primal Sketch Representation • Image sketch: low-level summary of pixel information – Marr, perceptual grouping, Lindeberg, Guo, Wu & Zhu. • Lindeberg edges & ridges: focus on scale invariance. [Figure: smoothed image, edge strength, and ridge strength across scales] • Remove appearance variation, focus on shape. • Extract information related to boundaries/symmetry axes. • Use as features for both modelling and detection.
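As a rough illustration of scale-normalised edge strength in Lindeberg's spirit, the sketch below smooths an image with a Gaussian at scale t = σ² and scores t^γ |∇L|²; the exact normalisation and the ridge measure used in the talk may differ:

```python
import numpy as np

def gaussian_kernel(sigma):
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def smooth(img, sigma):
    # separable Gaussian smoothing: rows, then columns
    k = gaussian_kernel(sigma)
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, tmp)

def edge_strength(img, sigma, gamma=0.5):
    """Scale-normalised squared gradient magnitude t^gamma * (Lx^2 + Ly^2),
    with t = sigma^2, in the spirit of Lindeberg's edge measure."""
    L = smooth(img, sigma)
    Ly, Lx = np.gradient(L)
    t = sigma ** 2
    return (t ** gamma) * (Lx**2 + Ly**2)

# a vertical step edge: strength should peak at the step
img = np.zeros((16, 16))
img[:, 8:] = 1.0
E = edge_strength(img, sigma=1.5)
```

The scale normalisation is what lets responses at different σ be compared, which is the point of the scale-invariance focus above.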

  5. Datasets • Overall goal is to learn models and use them for object detection. • Datasets from detection benchmarks were used, containing unsegmented images with significant noise and occlusions. Cars: Agarwal & Roth, ECCV 02. Horses: Borenstein & Ullman, ECCV 02. Cows: Leibe & Schiele, ECCV 04. Faces: Fergus et al., CVPR 03. Hands: Gomez & Stegmann, imm.dtu.dk.

  6. Learning Active Appearance Models

  7. AAMs: Linear, Global Deformation Models • Composite deformations modelled as a superposition of simpler ones. • Combination of shape, S = S0 + Σ_i b_i S_i, and texture, T = T0 + Σ_i c_i T_i, for synthesis: I(S(x)) = T(x). • Model fitting: – Minimization of Σ_x |I(S(x)) − T(x)|². – Stochastic GD, Newton-Raphson, Inverse Compositional. • Model Learning?
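The linear shape (or texture) model is conventionally learned by PCA on training vectors; here is a hedged numpy sketch, assuming the landmark vectors are already registered, which is precisely what the unsupervised method must provide:

```python
import numpy as np

def learn_shape_model(shapes, n_modes):
    """PCA shape model: mean shape s0 plus deformation basis Phi,
    so a synthesized shape is s(b) = s0 + Phi @ b."""
    s0 = shapes.mean(axis=0)
    _, _, Vt = np.linalg.svd(shapes - s0, full_matrices=False)
    Phi = Vt[:n_modes].T  # deformation eigenvectors as columns
    return s0, Phi

def synthesize(s0, Phi, b):
    return s0 + Phi @ b

# toy data: flattened (x1, y1, x2, y2, x3, y3) shapes varying
# along a single known direction
rng = np.random.default_rng(0)
direction = np.array([1.0, 0.0, -1.0, 0.0, 1.0, 0.0])
shapes = rng.normal(size=(50, 1)) * direction + 5.0
s0, Phi = learn_shape_model(shapes, n_modes=1)
s = synthesize(s0, Phi, np.array([2.0]))
```

With annotated landmarks this is the standard supervised AAM recipe; the question posed on the slide is how to get the registrations without annotation.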

  8. Unsupervised Learning of AAMs • Goal: register training images with prototype template. – Unknowns: template, deformation basis and coefficients. • Previous Work: – Vetter, Jones & Poggio, CVPR 97: Bootstrapping • Iterate: AAM fitting - optical flow - PCA on optical flow – Cootes et al. ECCV 2004: Diffeomorphisms • Guarantee 1-1 mapping, but not of the typical PCA type. – Baker, Matthews, Schneider, PAMI 04: Coding Length • Global reconstruction criterion. • Our Contributions – EM Formulation – Mean Shift Clustering for Eigenvector Initialization – `Feature Transport’ PDE.

  9. AAM Learning Block Diagram [Block diagram: Input Images → Primal Sketch → Mean Shift Clustering → new eigenvector → AAM Fit, inside an EM loop (E: deform, M: update), yielding Registrations and Model Parameters]

  10. EM Approach to Learning AAMs • Starting Points: – `Coding’ criterion (BMS ’04). – EM formulation. • Parameters: synthesis model for deformations and template. • Hidden variables: coefficients matching the template with individual images. • EM-based minimization: – E-step: posterior on hidden variables, given parameters (model fitting: estimate the coefficient posterior). – M-step: maximize the expected log-likelihood w.r.t. basis elements & template.
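A toy version of the E/M alternation, with the deformation treated additively (a linearisation for illustration; the actual E-step fits a warp, not a linear residual):

```python
import numpy as np

def em_learn(X, n_modes, n_iters=20):
    """Alternate: E-step fits per-image coefficients C (hidden variables)
    given template mu and basis B; M-step updates the template and
    re-fits the basis (parameters). X is (n_images, dim)."""
    n, d = X.shape
    mu = X.mean(axis=0)
    # random orthonormal initial basis (the talk uses Mean Shift instead)
    B = np.linalg.qr(np.random.default_rng(0).normal(size=(d, n_modes)))[0]
    for _ in range(n_iters):
        C = (X - mu) @ B                 # E-step: coefficients per image
        mu = (X - C @ B.T).mean(axis=0)  # M-step: template update
        _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
        B = Vt[:n_modes].T               # M-step: basis update
    C = (X - mu) @ B
    return mu, B, C

# rank-1 synthetic data: one true deformation mode around a mean of 3.0
rng = np.random.default_rng(1)
v = np.zeros(10); v[0], v[3] = 0.6, 0.8
X = 3.0 + np.outer(rng.normal(size=30), v)
mu, B, C = em_learn(X, n_modes=1)
```

The point of the sketch is the split: coefficients are inferred per image, then the shared model is re-estimated from all images at once.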

  11. Mean Shift Clustering for Initialization • Alternative views of Primal Sketch Contours: – 2D images, 1D contours, 0D point sets. • Phrase registration as clustering of points. • Nonparametric clustering: Mean Shift – Variation: remove motion component along contour orientation. – Collapses contour `spaghetti’ onto single contour. – Used for eigenvector initialization and template construction.

  12. Feature Transport PDE • Problem Addressed: – Deformations can `swallow’ template features: instead of matching the template to the image, its features are hidden. • `Feature Transport’ idea: do not `accelerate’ across features. – Constraint on the deformation field. – Project onto the nearest field satisfying the constraint, derived via Calculus of Variations.

  13. Deformation Eigenmodes

  14. Registration Results: Improvements in Template Clarity

  15. Quantitative Results • 50 images from each category, 18-50 landmarks per image. • Error measure: – Backward warp images to the template coordinate system. – Calculate the covariance of landmark locations. – Estimate the `radius’ of the containing circle. [Plot legend: green: translation only; blue: learned model; red: manual model]
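The error measure can be sketched directly: given each landmark's back-warped locations across images, take the trace of its covariance as a squared `radius' (this is one plausible reading of the containing-circle estimate, not necessarily the exact statistic used):

```python
import numpy as np

def landmark_radius(landmarks):
    """landmarks: (n_images, n_landmarks, 2) array of landmark
    locations, already back-warped to template coordinates.
    Returns one spread 'radius' per landmark, computed as the square
    root of the total variance (trace of the 2x2 covariance)."""
    centered = landmarks - landmarks.mean(axis=0)
    radii = []
    for j in range(landmarks.shape[1]):
        cov = np.cov(centered[:, j, :].T)
        radii.append(np.sqrt(np.trace(cov)))
    return np.array(radii)

# two images, two landmarks: the first landmark moves, the second is fixed
radii = landmark_radius(np.array([[[0., 0.], [5., 5.]],
                                  [[2., 0.], [5., 5.]]]))
```

A tighter radius means the registration maps that landmark more consistently to the same template location.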

  16. Learning Part-Based Models

  17. Part-Based Deformation Models • AAM limitations: – Global deformation models. – No occlusion or layered appearance modelling. – Local minima due to greedy fitting. • Part-Based Models (Graphical Models): – Each object part corresponds to a node of the graph. – Node state: parameters of a local deformation model. – Clique potentials: kinematic constraints. [Figure: template, deformed template, and the deformed state of part i] • Divide-and-conquer modelling of deformations: – Small set of simple models. – Inference can avoid local minima and handle occlusions.

  18. Model Initialization • Part Detection: – Cluster ridges (symmetry axes) using Mean Shift, but now moving only along the contour orientation. [Figure: detected parts] • Initial parameter estimates: obtained from AAM fitting results.

  19. Learning the Model via EM • Split the problem unknowns: – Hidden variables: node states for each individual image. – Parameters: clique potentials, network structure, template. • E-step: estimate the posterior distribution on hidden states – Tree-structured model; inference on graphs. • M-step: use the posterior to update model parameters – Hinged-joint model estimation for clique potentials (Least Squares). – Structure learning (Minimum Spanning Tree). – Learning the observation model (Niblack thresholding).
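Structure learning via a spanning tree can be sketched with Prim's algorithm on a part-affinity matrix (the affinity itself, e.g. how tightly two parts' relative poses co-vary, is left abstract here):

```python
import numpy as np

def spanning_tree(weights):
    """Maximum-weight spanning tree by Prim's algorithm.
    weights: symmetric (n, n) affinity matrix between parts.
    Returns the tree as a list of (parent, child) edges.
    (Minimum spanning on a distance matrix is the same with
    the comparison flipped.)"""
    n = weights.shape[0]
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree:
                    if best is None or weights[i, j] > weights[best[0], best[1]]:
                        best = (i, j)
        edges.append(best)
        in_tree.add(best[1])
    return edges

# 3 parts: parts 0-1 strongly coupled, part 2 closer to 1 than to 0
W = np.array([[0., 5., 1.],
              [5., 0., 2.],
              [1., 2., 0.]])
edges = spanning_tree(W)
```

The resulting tree is what makes exact inference in the E-step tractable.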

  20. Nonparametric Belief Propagation for the E-step • Belief propagation algorithm: circulate messages in the graph; the continuous message products are the problem. • NBP (Sudderth, Ihler et al., CVPR ’03): – Sample-based approximation to messages. – Gibbs-sampling-based computation of the product between messages. – Posterior on nodes. • Still problematic if we must evaluate the likelihood by cropping, rotating, and rescaling e.g. 100 image patches for each node.

  21. Speeding Up the E-step • Use binary templates in the appearance model (Niblack thresholding). • For each state being summed over: – Deform the part template correspondingly. – Sum ridge/edge strength inside/outside the template interior. [Figure: binary template with ridge and edge interiors under deformed states S1, S2, S3] – Use the result as a feature in the observation potential. • Key idea: replace area summations with boundary computations using Stokes’ theorem.
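The discrete analogue of trading an area sum for a boundary computation is the integral image (summed-area table): any axis-aligned rectangle sum reduces to four corner lookups. The Stokes'-theorem trick generalizes this to deformed template interiors; the rectangle case is shown here only as a sketch of the idea:

```python
import numpy as np

def integral_image(img):
    """Summed-area table: ii[y, x] = sum of img[:y+1, :x+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] via four corner lookups in ii,
    instead of summing over the whole interior."""
    total = ii[y1 - 1, x1 - 1]
    if y0 > 0:
        total -= ii[y0 - 1, x1 - 1]
    if x0 > 0:
        total -= ii[y1 - 1, x0 - 1]
    if y0 > 0 and x0 > 0:
        total += ii[y0 - 1, x0 - 1]
    return total

rng = np.random.default_rng(0)
img = rng.random((8, 9))   # e.g. a ridge- or edge-strength map
ii = integral_image(img)
s = rect_sum(ii, 1, 2, 4, 6)
```

The pay-off is that evaluating many candidate part states costs per-state work proportional to the template boundary, not its area.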

  22. Part-Based Model Results • Top-down syntheses: – Sketch the image using the most likely samples from the node posteriors. • Quantitative results: – Better than the AAM, almost as good as the manual model. [Plots: per-landmark error for Cows and Horses; curves for AAM, MRF-M, MRF-U, RCD]

  23. Top-down fitting results

  24. Conclusions & Discussion • Modelling deformations: – A basic prerequisite for building accurate models. – Requires `handholding’ and careful design. • The Primal Sketch greatly facilitates modelling: – Amenable to Mean Shift clustering. – Averaging over the training set provides boundaries & symmetry axes. – Facilitates part detection by clustering symmetry axes. – Removes appearance information. • Learning difficulty & performance: – The learned AAM performed on a par with a manually trained AAM. – Learning part-based models is feasible, but there is room for improvement.

  25. Future Work • Use for detection: – How can we combine the sparse set of primal sketch tokens to detect an object? • Extend learning: – Use the 1-D aspect of the primal sketch. – Learn the hierarchy of parts building up the object? – Allow alternative structures (And-Or graphs)? • Use top-down models for segmentation: – Top-down filling-in does most of the work. – Use hallucinated edges and ridges to segment the image (CVPR 06, ICCV 07). • Model appearance variation.
