

  1. Factored Shapes and Appearances for Parts-based Object Understanding S. M. Ali Eslami Christopher K. I. Williams British Machine Vision Conference September 2, 2011

  2. Classification

  3. Localisation

  4. Segmentation

  5. This talk’s focus: segment this image (photo: Panoramio/nicho593).


  8. Outline: 1. The segmentation task 2. The FSA model 3. Experimental results 4. Discussion

  9. The segmentation task: the image X and its segmentation S, a binary mask over pixels.

  11. The segmentation task. The generative approach: construct a joint model of X and S parameterised by θ, p(X, S | θ); learn θ given a dataset D_train, i.e. find arg max_θ p(D_train | θ); then return a probable segmentation S_test given X_test and θ, via p(S_test | X_test, θ). Some benefits of this approach: it is flexible with regard to data (unsupervised or semi-supervised training), and the quality of the model can be inspected by sampling from it.
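The three-step generative recipe on this slide can be sketched with a deliberately tiny stand-in model: a Gaussian over pixel intensities per label. All names and numbers below are illustrative, not the FSA model itself.

```python
import numpy as np

def learn_theta(images, segmentations):
    """ML estimate of theta: a Gaussian (mean, std) per label (0=bg, 1=fg)."""
    x = np.concatenate([im.ravel() for im in images])
    s = np.concatenate([sg.ravel() for sg in segmentations])
    return {k: (x[s == k].mean(), x[s == k].std() + 1e-6) for k in (0, 1)}

def segment(image, theta, prior_fg=0.5):
    """Per-pixel MAP label under p(s | x, theta), via Bayes' rule."""
    scores = []
    for k, prior in ((0, 1.0 - prior_fg), (1, prior_fg)):
        mu, sd = theta[k]
        scores.append(np.log(prior) - np.log(sd)
                      - 0.5 * ((image - mu) / sd) ** 2)
    return np.argmax(np.stack(scores), axis=0)

# Train on one labelled image, then segment unseen pixels.
im = np.array([[0.1, 0.2, 0.9], [0.1, 0.8, 0.9]])
sg = np.array([[0, 0, 1], [0, 1, 1]])
theta = learn_theta([im], [sg])
print(segment(np.array([[0.15, 0.85]]), theta))  # -> [[0 1]]
```

The same interface (fit a joint model, then query the label posterior) is what FSA implements with a far richer p(X, S | θ).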

  14. Factored Shapes and Appearances. Goal: construct a joint model of X and S parameterised by θ, p(X, S | θ). Factor appearances: reason about an object's shape independently of its appearance. Factor shapes: represent objects as collections of parts; systematic combination of the parts generates objects' complete shapes. Learn everything: explicitly model the variation of appearances and shapes.

  15. Factored Shapes and Appearances: schematic diagram of the generative process (latent shape variable v, segmentation S, part appearances A, rendered image X).

  16. Factored Shapes and Appearances: graphical model (n images, L parts, D pixels per image). Parameters: θ_s (shape statistics), θ_a (appearance statistics). Latent variables: a_ℓ (per-part appearance), v (global shape type), s (segmentation).

  17. Factored Shapes and Appearances: shape model. The joint distribution factorises as

  p(X, A, S, v | θ) = p(v) p(A | θ_a) ∏_{d=1}^{D} p(s_d | v, θ_s) p(x_d | A, s_d, θ_a)

  19. Factored Shapes and Appearances: shape model, continuous parameterisation.

  p(s_d^ℓ = 1 | v, θ) = exp(m_ℓd) / Σ_{k=0}^{L} exp(m_kd)

  Efficient: finds a probable assignment of pixels to parts without having to enumerate all part depth orderings, and resolves ambiguities by exploiting knowledge about appearances.
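A minimal sketch of this softmax parameterisation: each part ℓ has a real-valued mask m_ℓ over pixels, and the probability of pixel d belonging to part ℓ is the softmax over the mask values at d (part 0 plays the role of background; the mask numbers below are made up).

```python
import numpy as np

def part_posteriors(masks):
    """masks: (L+1, D) real-valued part masks -> (L+1, D) probabilities."""
    e = np.exp(masks - masks.max(axis=0))  # subtract the max for stability
    return e / e.sum(axis=0)

masks = np.array([[0.0, 0.0, 0.0],    # m_0: background mask
                  [3.0, -2.0, 0.0],   # m_1
                  [-2.0, 4.0, 0.0]])  # m_2
p = part_posteriors(masks)
print(p.argmax(axis=0))               # most probable part per pixel: [1 2 0]
```

Note that no depth ordering of the parts is ever enumerated; the softmax resolves the competition between overlapping masks pixel by pixel.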

  20. Factored Shapes and Appearances: handling occlusion (illustration of overlapping part masks m_0, m_1, m_2).

  21. Factored Shapes and Appearances: handling occlusion. The part masks compete at each pixel, yielding the segmentation S and, together with the appearances A, the image X.

  22. Factored Shapes and Appearances: learning shape variability. Goal: instead of learning just a template for each part, learn a distribution over such templates. Linear latent variable model: part ℓ's mask m_ℓ is governed by a Factor Analysis-like distribution,

  p(v) = N(0, I_{H×H}),   m_ℓ = F_ℓ v + c_ℓ,

  where v is a low-dimensional latent variable, F_ℓ is the factor loading matrix and c_ℓ is the mean mask. Shape parameters θ_s = {{F_ℓ}, {c_ℓ}}.
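The sampling side of this shape model can be sketched as follows; array sizes are illustrative (the paper's masks live on an image grid), and the random loadings stand in for learned F_ℓ and c_ℓ.

```python
import numpy as np

rng = np.random.default_rng(0)
L, H, D = 2, 2, 16              # foreground parts, latent dims, pixels

F = rng.normal(size=(L, D, H))  # factor loading matrices F_l (illustrative)
c = rng.normal(size=(L, D))     # mean masks c_l

v = rng.normal(size=H)          # global shape variable, v ~ N(0, I_H)
m = F @ v + c                   # part masks m_l = F_l v + c_l
m = np.vstack([np.zeros(D), m]) # fix the background mask m_0 at zero

# Feed the masks through the softmax to get per-pixel part probabilities.
e = np.exp(m - m.max(axis=0))
probs = e / e.sum(axis=0)
print(probs.shape)              # (3, 16): L+1 probabilities per pixel
```

Because a single v drives every F_ℓ, the parts deform coherently: one draw of v moves all masks together through shape space.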

  23. Factored Shapes and Appearances: appearance model. The same factorisation,

  p(X, A, S, v | θ) = p(v) p(A | θ_a) ∏_{d=1}^{D} p(s_d | v, θ_s) p(x_d | A, s_d, θ_a)

  26. Factored Shapes and Appearances: appearance model. Goal: learn a model of each part's RGB values that is as informative as possible about its extent in the image. Position-agnostic appearance model: learn about the distribution of colours across images, and about the distribution of colours within images. Sampling process, for each part: 1. sample an appearance 'class' for the part; 2. sample the part's pixels from that class's feature histogram.
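The two-step sampling process above can be sketched directly; π (class prior), φ (per-class histograms) and all numbers here are made-up stand-ins for learned appearance statistics.

```python
import numpy as np

rng = np.random.default_rng(1)

pi = np.array([0.7, 0.3])          # class prior for this part (K = 2)
phi = np.array([[0.8, 0.1, 0.1],   # class 0: histogram over 3 colour bins
                [0.1, 0.1, 0.8]])  # class 1

k = rng.choice(len(pi), p=pi)              # 1. sample an appearance class
pixels = rng.choice(3, size=10, p=phi[k])  # 2. sample pixel bins from phi_k
print(k, pixels)
```

Sampling the class once per part, then all of the part's pixels from that class, is what couples pixels of a part together while leaving the model agnostic about where in the image the part sits.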

  27. Factored Shapes and Appearances: appearance model. Learned class priors π and class histograms φ for parts ℓ = 0, 1, 2 on the training data.

  28. Factored Shapes and Appearances: learning. Use EM to find a setting of the shape and appearance parameters that approximately maximises their likelihood given the data, p(D_train | θ): 1. Expectation: block Gibbs and elliptical slice sampling (Murray et al., 2010) to approximate p(Z_i | X_i, θ_old); 2. Maximisation: gradient-based optimisation to find arg max_θ Q(θ, θ_old), where

  Q(θ, θ_old) = Σ_{i=1}^{n} Σ_{Z_i} p(Z_i | X_i, θ_old) ln p(X_i, Z_i | θ)
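The Monte Carlo EM loop on this slide can be sketched on a toy problem; a simple sampled E-step stands in for block Gibbs and elliptical slice sampling, and the model (a 1-D two-component Gaussian mixture) is purely illustrative, not the FSA model.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])

mu = np.array([-0.5, 0.5])  # initial theta: the two component means
for _ in range(30):
    # E-step (sampled): draw assignments z_i ~ p(z_i | x_i, theta_old),
    # standing in for the Gibbs / elliptical slice samples of Z_i.
    logp = -0.5 * (x[:, None] - mu[None, :]) ** 2
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    z = (rng.random(len(x)) < p[:, 1]).astype(int)
    # M-step: maximise Q by re-estimating each mean from its samples.
    mu = np.array([x[z == k].mean() for k in (0, 1)])

print(np.sort(mu))  # the two means separate towards -2 and +2
```

The structure is the same as in the paper: posterior samples of the latents approximate the expectation in Q, and the M-step climbs Q given those samples.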

  29. Related work

  Model                            | Factored shape & appearance | Parts      | Shape variability | Appearance variability
  LSM (Frey et al.)                | -                           | ✓ (layers) | ✓ (FA)            | ✓ (FA)
  Sprites (Williams and Titsias)   | -                           | -          | -                 | ✓ (layers)
  LOCUS (Winn and Jojic)           | -                           | ✓          | ✓ (deformation)   | ✓ (colours)
  MCVQ (Ross and Zemel)            | -                           | -          | ✓                 | ✓ (templates)
  SCA (Jojic et al.)               | -                           | ✓          | ✓ (convex)        | ✓ (histograms)
  FSA                              | ✓ (softmax)                 | ✓          | ✓ (FA)            | ✓ (histograms)

  30. Outline: 1. The segmentation task 2. The FSA model 3. Experimental results 4. Discussion

  31. Learning a model of cars: training images.

  33. Learning a model of cars. Model details: number of parts L = 3, number of latent shape dimensions H = 2, number of appearance classes K = 5. (Example images X with inferred segmentations S.)

  34. Learning a model of cars: shape model weights for part ℓ = 2. Column 1 of F_2 spans Convertible ← → Coupé; column 2 spans Low ← → High.

  35. Learning a model of cars: samples from the latent shape space, with each latent dimension varied from −3 to +3.

  36. Learning a model of cars: the latent shape space (axes from −3 to +3) captures Saloon – Hatchback – Convertible – SUV variation.

  37. Other datasets: training data, mean model, and FSA samples.

  38. Other datasets: samples from the latent shape space (axes from −2 to +2).

  39. Segmentation benchmarks. Datasets: Weizmann horses (127 train, 200 test); Caltech4: Cars (63 train, 60 test), Faces (335 train, 100 test), Motorbikes (698 train, 100 test), Airplanes (700 train, 100 test). Two variants: Unsupervised FSA, trained given only RGB images; Supervised FSA, trained using RGB images and their binary masks.

  40. Segmentation benchmarks (segmentation accuracy)

                             Weizmann   Caltech4
                             Horses     Cars     Faces    Motorbikes  Airplanes
  GrabCut (Rother et al.)    83.9%      45.1%    83.7%    82.4%       84.5%
  Borenstein et al.          93.6%      -        -        -           -
  LOCUS (Winn et al.)        93.1%      91.4%    -        -           -
  Arora et al.               -          95.1%    92.4%    83.1%       93.1%
  ClassCut (Alexe et al.)    86.2%      93.1%    89.0%    90.3%       89.8%
  Unsupervised FSA           87.3%      82.9%    88.3%    85.7%       88.7%
  Supervised FSA             88.0%      93.6%    93.3%    92.1%       90.9%

  Competitive, despite the lack of CRF-style pairwise pixel dependency terms.

  41. Summary. FSA is a probabilistic, generative model of images that reasons about object shape independently of its appearance, represents objects as collections of parts, and explicitly models the variation of both appearances and shapes. Object segmentation with FSA is competitive. The same FSA model can potentially also be used to classify objects into sub-categories (using the latent v variables), localise objects (using a sliding window or branch and bound), and parse objects into meaningful parts.

  42. Questions

  43. Learning a supervised model of cars: samples from the latent shape space (axes from −3 to +3).
